PRELIMINARY

Enhanced Am486 Microprocessor Family

Advanced Micro Devices

Publication# 19225  Rev: C  Amendment/0  Issue Date: March 1996

This document contains information on a product under development at Advanced Micro Devices. The information is intended to help you evaluate this product. AMD reserves the right to change or discontinue work on this proposed product without notice.

DISTINCTIVE CHARACTERISTICS

- High-performance design
    - Improved cache structure supports industry-standard write-back cache
    - Frequent instructions execute in one clock
    - 80-million bytes/second burst bus at 25 MHz
    - 105.6-million bytes/second burst bus at 33 MHz
    - 128-million bytes/second burst bus at 40 MHz
    - Flexible write-through and write-back address control
    - 0.5-micron CMOS process technology
    - Dynamic bus sizing for 8-, 16-, and 32-bit buses
    - Supports soft reset capability
- High on-chip integration
    - 8-Kbyte unified code and data cache
    - Floating-point unit
    - Paged, virtual memory management
- Enhanced system and power management
    - Stop clock control for reduced power consumption
    - Industry-standard 2-pin System Management Interrupt (SMI) for power management independent of processor operating mode and operating system
    - Static design with Auto Halt power-down support
    - Wide range of chipsets supporting SMM available to allow product differentiation
- Complete 32-bit architecture
    - Address and data buses
    - All registers
    - 8-, 16-, and 32-bit data types
- Standard features
    - 3-V core with 5-V tolerant I/O
    - Available in DX2 and DX4 versions
    - Binary compatible with all Am486DX and Am486DX2 microprocessors
    - Wide range of chipsets and support available through the AMD FusionPC program
- 168-pin PGA package or 208-pin SQFP package
- IEEE 1149.1 JTAG boundary-scan compatibility
- Supports Environmental Protection Agency's Energy Star program
    - 3-V operation reduces power consumption up to 40%
    - Energy management capability provides excellent base for energy-efficient design
    - Works with a variety of energy-efficient, power-managed devices

GENERAL DESCRIPTION

The Enhanced Am486 microprocessor family is an addition to the Am486 microprocessor family of products. The new family enhances system performance by incorporating a write-back cache implementation, flexible clock control, and enhanced SMM. Table 1 shows available processors in the Enhanced Am486 microprocessor family.

The Enhanced Am486 microprocessor family cache allows write-back configuration through software and cacheable access control. On-chip cache lines are configurable as either write-through or write-back.

The enhanced CPU clock control feature permits the CPU clock to be stopped under controlled conditions, allowing reduced power consumption during system inactivity. The SMM function is implemented with an industry-standard two-pin interface.

Table 1. Clocking Options

CPU Type   Operating Frequency   Bus Speed   Available Package
DX2        66 MHz                33 MHz      168-pin PGA
           80 MHz                40 MHz      168-pin PGA
DX4        75 MHz                25 MHz      168-pin PGA or 208-pin SQFP
           100 MHz               33 MHz      168-pin PGA or 208-pin SQFP
           120 MHz               40 MHz      168-pin PGA
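The three burst-bus figures quoted in the distinctive characteristics (80, 105.6, and 128 million bytes/second) are consistent with one 16-byte burst, that is four 32-bit transfers, completing every five bus clocks. The five-clock burst length is an assumption made for this sketch (a zero-wait-state 2-1-1-1 burst); the text itself gives only the resulting rates. The names `BYTES_PER_BURST`, `CLOCKS_PER_BURST`, and `burst_mbytes_per_s` are illustrative.

```python
# Assumption: one burst moves four 32-bit words (16 bytes) in five bus clocks
# (2-1-1-1). The datasheet text quotes only the resulting transfer rates.
BYTES_PER_BURST = 16
CLOCKS_PER_BURST = 5


def burst_mbytes_per_s(bus_mhz):
    """Peak burst transfer rate in millions of bytes per second."""
    return bus_mhz * BYTES_PER_BURST / CLOCKS_PER_BURST


# Bus speeds listed in Table 1.
for bus_mhz in (25, 33, 40):
    print(f"{bus_mhz} MHz bus: {burst_mbytes_per_s(bus_mhz):.1f} million bytes/second")
```

Under that assumption the function reproduces the quoted rates for the 25, 33, and 40 MHz bus speeds.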
BLOCK DIAGRAM

[Figure: block diagram of the Enhanced Am486 microprocessor. Labels recoverable from the figure:]

Functional blocks: control ROM; instruction decode (decoded instruction path, micro-instruction); prefetcher with 32-byte code queue (2 x 16 bytes); floating-point unit and floating-point register file; barrel shifter, register file, and ALU; segmentation unit (descriptor registers, limit and attribute PLA); paging unit (translation lookaside buffer); 8-Kbyte cache unit; control and protection test unit; clock generator; bus interface with write buffers (4 x 32), copyback buffers (4 x 32), and writeback buffers (4 x 32).

Internal buses: two 32-bit data buses, 32-bit linear address bus, physical address bus, 32-bit displacement bus, 24-bit code stream.

Signal groups:
    Bus control: ADS, W/R, D/C, M/IO, PCD, PWT, RDY, LOCK, PLOCK, BOFF, A20M, BREQ, HOLD, HLDA, RESET, INTR, NMI, FERR, UP, IGNNE, SMI, SMIACT, SRESET
    Burst bus control: BRDY, BLAST
    Bus size control: BS16, BS8
    Cache control: KEN, FLUSH, AHOLD, CACHE, EADS, INV, WB/WT, HITM
    Parity generation and control: PCHK, DP3-DP0
    Address drivers: A31-A2, BE3-BE0
    Data bus transceivers: D31-D0
    Clock interface: CLK, CLKMUL, STPCLK
    JTAG: TDI, TCK, TDO, TMS
    Power plane: VOLDET, VCC, VSS
ORDERING INFORMATION

Standard Products

AMD standard products are available in several packages and operating ranges. The order number (valid combination) is formed by a combination of the elements below.

Order number elements, in order (example: A 80486 DX4 -120 S V 8 B):

    Package type                A = 168-pin PGA (pin grid array)
                                S = 208-pin SQFP (shrink quad flat pack)
    Device number/description   80486 = Am486 high-performance CPU
    Version                     DX4 = clock-tripled with FPU
                                DX2 = clock-doubled with FPU
    Speed option                -120 = 120 MHz
                                -100 = 100 MHz
                                -80  = 80 MHz
                                -75  = 75 MHz
                                -66  = 66 MHz
    S                           S = Enhanced
    Voltage                     V = VCC is 3 V with 5 V I/O tolerance
    Cache size                  8 = 8 Kbytes
    Cache type                  B = write-back

Valid Combinations

    A80486DX4   -120SV8B, -100SV8B, -75SV8B
    A80486DX2   -80SV8B, -66SV8B
    S80486DX4   -100SV8B, -75SV8B

Valid combinations list configurations planned to be supported in volume for this device. Consult the local AMD sales office to confirm availability of specific valid combinations and to check on newly released combinations.
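As an illustration of how the order-number elements combine, the following Python sketch decodes an order number and checks it against the valid combinations. It assumes the elements are concatenated in the order package, device number, version, speed, S, V, 8, B (for example `A80486DX4-120SV8B`); the function and table names are hypothetical and not part of the datasheet.

```python
import re

# Assumed concatenation order: package, 80486, version, -speed, S, V, 8, B.
ORDER_NUMBER = re.compile(
    r"(?P<package>[AS])80486(?P<version>DX[24])-(?P<speed>\d+)SV8B"
)

PACKAGES = {"A": "168-pin PGA", "S": "208-pin SQFP"}
MULTIPLIERS = {"DX2": 2, "DX4": 3}  # clock-doubled / clock-tripled

# Speed options listed under Valid Combinations for each package and version.
VALID_SPEEDS = {
    ("A", "DX4"): {120, 100, 75},
    ("A", "DX2"): {80, 66},
    ("S", "DX4"): {100, 75},
}


def decode_order_number(text):
    """Split an order number into its elements and flag listed combinations."""
    match = ORDER_NUMBER.fullmatch(text.strip().upper())
    if match is None:
        raise ValueError(f"not an Enhanced Am486 order number: {text!r}")
    package = match["package"]
    version = match["version"]
    speed = int(match["speed"])
    return {
        "package": PACKAGES[package],
        "version": version,
        "multiplier": MULTIPLIERS[version],
        "speed_mhz": speed,
        "listed": speed in VALID_SPEEDS.get((package, version), set()),
    }


print(decode_order_number("A80486DX4-120SV8B"))
```

A well-formed number that is not in the valid combinations list, such as a 120 MHz part in the SQFP package, decodes normally but comes back with `listed` set to `False`.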
TABLE OF CONTENTS

1 Connection Diagrams and Pin Designations
  1.1 168-Pin PGA (Pin Grid Array) Package
  1.2 168-Pin PGA Designations (Functional Grouping)
  1.3 208-Pin SQFP (Shrink Quad Flat Pack) Package
  1.4 208-Pin SQFP Designations (Functional Grouping)
2 Logic Symbol
3 Pin Description
4 Functional Description
  4.1 Overview
  4.2 Memory
  4.3 Modes of Operation
    4.3.1 Real Mode
    4.3.2 Virtual Mode
    4.3.3 Protected Mode
    4.3.4 System Management Mode
  4.4 Cache Architecture
    4.4.1 Write-Through Cache
    4.4.2 Write-Back Cache
  4.5 Write-Back Cache Protocol
    4.5.1 Cache Line Overview
    4.5.2 Line Status and Line State
      4.5.2.1 Invalid
      4.5.2.2 Exclusive
      4.5.2.3 Shared
      4.5.2.4 Modified
  4.6 Cache Replacement Description
  4.7 Memory Configuration
    4.7.1 Cacheability
    4.7.2 Write-Through/Write-Back
  4.8 Cache Functionality in Write-Back Mode
    4.8.1 Processor-Induced Actions and State Transitions
    4.8.2 Snooping Actions and State Transitions
      4.8.2.1 Difference between Snooping Access Cases
      4.8.2.2 HOLD Bus Arbitration Implementation
        4.8.2.2.1 Processor-Induced Bus Cycles
        4.8.2.2.2 External Read
        4.8.2.2.3 External Write
        4.8.2.2.4 HOLD/HLDA External Access Timing
    4.8.3 External Bus Master Snooping Actions
      4.8.3.1 Snoop Miss
      4.8.3.2 Snoop Hit to a Non-Modified Line
    4.8.4 Write-Back Case
    4.8.5 Write-Back and Pending Access
      4.8.5.1 HOLD/HLDA Write-Back Design Considerations
      4.8.5.2 AHOLD Bus Arbitration Implementation
      4.8.5.3 Normal Write-Back
    4.8.6 Reordering of Write-Backs (AHOLD) with BOFF
    4.8.7 Special Scenarios for AHOLD Snooping
      4.8.7.1 Write Cycle Reordering due to Buffering
      4.8.7.2 BOFF Write-Back Arbitration Implementation
    4.8.8 BOFF Design Considerations
      4.8.8.1 Cache Line Fills
      4.8.8.2 Cache Line Copy-Backs
      4.8.8.3 Locked Accesses
    4.8.9 BOFF during Write-Back
    4.8.10 Snooping Characteristics during a Cache Line Fill
    4.8.11 Snooping Characteristics during a Copy-Back
  4.9 Cache Invalidation and Flushing in Write-Back Mode
    4.9.1 Cache Invalidation through Software
    4.9.2 Cache Invalidation through Hardware
    4.9.3 Snooping during Cache Flushing
  4.10 Burst Write
    4.10.1 Locked Accesses
    4.10.2 Serialization
    4.10.3 PLOCK Operation in Write-Through Mode
5 Clock Control
  5.1 Clock Generation
  5.2 Stop Clock
    5.2.1 External Interrupts in Order of Priority
  5.3 Stop Grant Bus Cycle
  5.4 Pin State during Stop Grant
  5.5 Clock Control State Diagram
    5.5.1 Normal State
    5.5.2 Stop Grant State
    5.5.3 Stop Clock State
    5.5.4 Auto Halt Power Down State
    5.5.5 Stop Clock Snoop State (Cache Invalidations)
    5.5.6 Cache Flush State
6 SRESET Function
7 System Management Mode
  7.1 Overview
  7.2 Terminology
  7.3 System Management Interrupt Processing
    7.3.1 System Management Interrupt Processing
    7.3.2 SMI Active (SMIACT)
    7.3.3 SMRAM
    7.3.4 SMRAM State Save Map
  7.4 Entering System Management Mode
  7.5 Exiting System Management Mode
  7.6 Processor Environment
  7.7 Executing System Management Mode Handler
    7.7.1 Exceptions and Interrupts with System Management Mode
    7.7.2 SMM Revisions Identifier
    7.7.3 Auto Halt Restart
    7.7.4 I/O Trap Restart
    7.7.5 I/O Trap Word
    7.7.6 SMM Base Relocation
  7.8 SMM System Design Considerations
    7.8.1 SMRAM Interface
    7.8.2 Cache Flushes
    7.8.3 A20M Pin
    7.8.4 CPU Reset during SMM
    7.8.5 SMM and Second Level Write Buffers
    7.8.6 Nested SMI and I/O Restart
  7.9 SMM Software Considerations
    7.9.1 SMM Code Considerations
    7.9.2 Exception Handling
    7.9.3 Halt during SMM
    7.9.4 Relocating SMRAM to an Address above 1 Mbyte
8 Test Registers 4 and 5 Modifications
  8.1 TR4 Definition
  8.2 TR5 Definition
  8.3 Using TR4 and TR5 for Cache Testing
    8.3.1 Example 1: Reading the Cache (Write-Back Mode Only)
    8.3.2 Example 2: Writing the Cache
    8.3.3 Example 3: Flushing the Cache
9 Enhanced Am486 CPU Functional Differences
  9.1 Status after Reset
  9.2 Cache Status
10 Enhanced Am486 CPU Identification
  10.1 DX Register at RESET
  10.2 CPUID Instruction
    10.2.1 CPUID Timing
    10.2.2 CPUID Operation
11 Electrical Data
  11.1 Power and Grounding
    11.1.1 Power Connections
    11.1.2 Power Decoupling Recommendations
    11.1.3 Other Connection Recommendations
12 Package Thermal Specifications
13 Physical Dimensions

FIGURES

Figure 1   Processor-Induced Line Transitions in Write-Back Mode
Figure 2   Snooping State Transitions
Figure 3   Typical System Block Diagram for HOLD/HLDA Bus Arbitration
Figure 4   External Read
Figure 5   External Write
Figure 6   Snoop of On-Chip Cache That Does Not Hit a Line
Figure 7   Snoop of On-Chip Cache That Hits a Non-Modified Line
Figure 8   Snoop That Hits a Modified Line (Write-Back)
Figure 9   Write-Back and Pending Access
Figure 10  Valid HOLD Assertion during Write-Back
Figure 11  Closely Coupled Cache Block Diagram
Figure 12  Snoop Hit Cycle with Write-Back
Figure 13  Cycle Reordering with BOFF (Write-Back)
Figure 14  Write Reordering due to Buffering
Figure 15  Latest Snooping of Copy-Back
Figure 16  Burst Write
Figure 17  Burst Read with BOFF Assertion
Figure 18  Burst Write with BOFF Assertion
Figure 19  Entering Stop Grant State
Figure 20  Recognition of Inputs when Exiting Stop Grant State
Figure 21  Stop Clock State Machine
Figure 22  Basic SMI Interrupt Service
Figure 23  Basic SMI Hardware Interface
Figure 24  SMI Timing for Servicing an I/O Trap
Figure 25  SMIACT Timing
Figure 26  Redirecting System Memory Address to SMRAM
Figure 27  Transition to and from SMM
Figure 28  Auto Halt Restart Register Offset
Figure 29  I/O Instruction Restart Register Offset
Figure 30  SMM Base Slot Offset
Figure 31  SRAM Usage
Figure 32  SMRAM Location
Figure 33  SMM Timing in Systems Using Non-Overlaid Memory Space and Write-Through Mode with Caching Enabled during SMM
Figure 34  SMM Timing in Systems Using Non-Overlaid Memory Spaces and Write-Back Mode with Caching Enabled during SMM
Figure 35  SMM Timing in Systems Using Non-Overlaid Memory Spaces and Write-Back Mode with Caching Disabled during SMM
Figure 36  SMM Timing in Systems Using Overlaid Memory Space and Write-Through Mode with Caching Enabled during SMM
Figure 37  SMM Timing in Systems Using Overlaid Memory Spaces and Write-Through Mode with Caching Disabled during SMM
Figure 38  SMM Timing in Systems Using Overlaid Memory Spaces and Configured in Write-Back Mode
Figure 39  CLK Waveforms
Figure 40  Output Valid Delay Timing
Figure 41  Maximum Float Delay Timing
Figure 42  PCHK Valid Delay Timing
Figure 43  Input Setup and Hold Timing
Figure 44  RDY and BRDY Input Setup and Hold Timing
Figure 45  TCK Waveforms
Figure 46  Test Signal Timing Diagram

TABLES

Table 1   Clocking Options
Table 2   EADS Sample Time
Table 3   Cache Line Organization
Table 4   Legal Cache Line States
Table 5   MESI Cache Line Status
Table 6   Key to Switching Waveforms
Table 7   WBINVD/INVD Special Bus Cycles
Table 8   FLUSH Special Bus Cycles
Table 9   Pin State during Stop Grant Bus State
Table 10  SMRAM State Save Map
Table 11  SMM Initial CPU Core Register Settings
Table 12  Segment Register Initial States
Table 13  System Management Mode Revision Identifier
Table 14  SMM Revision Identifier Bit Definitions
Table 15  Halt Auto Restart Configuration
Table 16  I/O Trap Word Configuration
Table 17  Test Register (TR4)
Table 18  Test Register (TR5)
Table 19  CPU ID Codes
Table 20  CPUID Instruction Description
Table 21  Thermal Resistance (°C/W) θJC and θJA for the Am486 CPU in 168-Pin PGA Package
Table 22  Maximum TA at Various Airflows in °C
1 connection diagrams and pin designations

1.1 168-pin pga (pin grid array) package

[figure: 168-pin pga pin diagram, pin side view, columns a-s by rows 1-17. the pin assignments are listed in section 1.2.]
1.2 168-pin pga designations (functional grouping)

address (pin name, pin no.):
a2 q-14, a3 r-15, a4 s-16, a5 q-12, a6 s-15, a7 q-13, a8 r-13, a9 q-11, a10 s-13, a11 r-12, a12 s-7, a13 q-10, a14 s-5, a15 r-7, a16 q-9, a17 q-3, a18 r-5, a19 q-4, a20 q-8, a21 q-5, a22 q-7, a23 s-3, a24 q-6, a25 r-2, a26 s-2, a27 s-1, a28 r-1, a29 p-2, a30 p-3, a31 q-1

data (pin name, pin no.):
d0 p-1, d1 n-2, d2 n-1, d3 h-2, d4 m-3, d5 j-2, d6 l-2, d7 l-3, d8 f-2, d9 d-1, d10 e-3, d11 c-1, d12 g-3, d13 d-2, d14 k-3, d15 f-3, d16 j-3, d17 d-3, d18 c-2, d19 b-1, d20 a-1, d21 b-2, d22 a-2, d23 a-4, d24 a-6, d25 b-6, d26 c-7, d27 c-6, d28 c-8, d29 a-8, d30 c-9, d31 b-8

control (pin name, pin no.):
a20m d-15, ads s-17, ahold a-17, be0 k-15, be1 j-16, be2 j-15, be3 f-17, blast r-16, boff d-17, brdy h-15, breq q-15, bs8 d-16, bs16 c-17, cache b-12, clk c-3, clkmul r-17, d/c m-15, dp0 n-3, dp1 f-1, dp2 h-3, dp3 a-5, eads b-17, ferr c-14, flush c-15, hitm a-12, hlda p-15, hold e-15, ignne a-15, intr a-16, inv a-10, ken f-15, lock n-15, m/io n-16, nmi b-15, pcd j-17, pchk q-17, plock q-16, pwt l-15, rdy f-16, reset c-16, smi b-10, smiact c-12, sreset c-10, stpclk g-15, up c-11, voldet s-4, wb/wt b-13, w/r n-17

test (pin name, pin no.):
tck a-3, tdi a-14, tdo b-16, tms b-14

inc (pin no.):
a-13, c-13, j-1

vcc (pin no.):
b-7, b-9, b-11, c-4, c-5, e-2, e-16, g-2, g-16, h-16, k-2, k-16, l-16, m-2, m-16, p-16, r-3, r-6, r-8, r-9, r-10, r-11, r-14

vss (pin no.):
a-7, a-9, a-11, b-3, b-4, b-5, e-1, e-17, g-1, g-17, h-1, h-17, k-1, k-17, l-1, l-17, m-1, m-17, p-17, q-2, r-4, s-6, s-8, s-9, s-10, s-11, s-12, s-14

notes: voldet is connected internally to vss. inc = internal no connect.
1.3 208-pin sqfp (shrink quad flat pack) package

[figure: 208-pin sqfp pin diagram, top view. the pin assignments are listed in section 1.4.]
1.4 208-pin sqfp designations (functional grouping)

address (pin name, pin no.):
a2 202, a3 197, a4 196, a5 195, a6 193, a7 192, a8 190, a9 187, a10 186, a11 182, a12 180, a13 178, a14 177, a15 174, a16 173, a17 171, a18 166, a19 165, a20 164, a21 161, a22 160, a23 159, a24 158, a25 154, a26 153, a27 152, a28 151, a29 149, a30 148, a31 147

data (pin name, pin no.):
d0 144, d1 143, d2 142, d3 141, d4 140, d5 130, d6 129, d7 126, d8 124, d9 123, d10 119, d11 118, d12 117, d13 116, d14 113, d15 112, d16 108, d17 103, d18 101, d19 100, d20 99, d21 93, d22 92, d23 91, d24 87, d25 85, d26 84, d27 83, d28 79, d29 78, d30 75, d31 74

control (pin name, pin no.):
a20m 47, ads 203, ahold 17, be0 31, be1 32, be2 33, be3 34, blast 204, boff 6, brdy 5, breq 30, bs8 8, bs16 7, cache 70, clk 24, clkmul 11, d/c 39, dp0 145, dp1 125, dp2 109, dp3 90, eads 46, ferr 66, flush 49, hitm 63, hlda 26, hold 16, ignne 72, intr 50, inv 71, ken 13, lock 207, m/io 37, nmi 51, pcd 41, pchk 4, plock 206, pwt 40, rdy 12, reset 48, smi 65, sreset 58, stpclk 73, smiact 59, up 194, wb/wt 64, w/r 27

test (pin name, pin no.):
tck 18, tdi 168, tdo 68, tms 167

inc (pin no.):
3, 67, 96, 127

vcc (pin no.):
2, 9, 14, 19, 20, 22, 23, 25, 29, 35, 38, 42, 44, 45, 54, 56, 60, 62, 69, 77, 80, 82, 86, 89, 95, 98, 102, 106, 111, 114, 121, 128, 131, 133, 134, 136, 137, 139, 150, 155, 162, 163, 169, 172, 176, 179, 183, 185, 188, 191, 198, 200, 205

vss (pin no.):
1, 10, 15, 21, 28, 36, 43, 52, 53, 55, 57, 61, 76, 81, 88, 94, 97, 104, 105, 107, 110, 115, 120, 122, 132, 135, 138, 146, 156, 157, 170, 175, 181, 184, 189, 199, 201, 208

note: inc = internal no connect.
2 logic symbol

[figure: logic symbol of the enhanced am486 cpu. signal groups recoverable from the drawing: clock (clk); address bus (a31-a4, a3-a2, be3-be0); data bus (d31-d0); data parity (dp3-dp0, pchk); bus cycle definition (m/io, d/c, w/r, lock, plock); bus cycle control (bs8, bs16, ads, rdy); burst control (brdy, blast); interrupts (intr, nmi, reset); page cacheability (pwt, pcd); cache control/invalidation (ken, flush, eads, ahold); ieee test port access (tms, tdi, tdo, tck); numeric error reporting (ferr, ignne); bus arbitration (breq, hold, hlda, boff); address mask (a20m); clock multiplier (clkmul); smm (smi, smiact); stop clock (stpclk); upgrade present (up); voltage detect (voldet). the drawing also shows cache, hitm, inv, wb/wt, and sreset.]
3 pin description

the enhanced am486 microprocessor family adds ten signals to those used by the am486dx processor. these added signals support the enhanced processor features and are indicated as new in the pin description titles. some am486dx cpu signals have new functions to implement the enhanced am486 processor write-back cache protocol. these signals are indicated as modified in the pin description titles. all other processor signals provide the same functionality as the am486dx processor.

a20m
address bit 20 mask (active low; input)
a low signal on the a20m pin causes the microprocessor to mask address line a20 before performing a lookup to the internal cache or driving a memory cycle on the bus. asserting a20m causes the processor to wrap the address at 1 mbyte, emulating real mode operation. the signal is asynchronous, but must meet setup and hold times t20 and t21 for recognition during a specific clock. during normal operation, a20m should be sampled high at the falling edge of reset.

a31-a4/a3-a2
address lines (inputs/outputs)/(outputs)
pins a31-a2 define a physical area in memory or indicate an input/output (i/o) device. address lines a31-a4 drive addresses into the microprocessor to perform cache line invalidations. input signals must meet setup and hold times t22 and t23. a31-a2 are not driven during bus or address hold.

ads
address status (active low; output)
a low output from this pin indicates that a valid bus cycle definition and address are available on the cycle definition lines and address bus. ads is driven active by the same clock as the addresses. ads is active low and is not driven during bus hold.

ahold (modified)
address hold (active high; input)
the external system may assert ahold to perform a cache snoop. in response to the assertion of ahold, the microprocessor stops driving the address bus a31-a2 in the next clock.
the data bus remains active and data can be transferred for previously issued read or write bus cycles during address hold. ahold is recognized even during reset and lock. the earliest that ahold can be deasserted is two clock cycles after eads is asserted to start a cache snoop. if hitm is activated due to a cache snoop, the microprocessor completes the current bus activity and then asserts ads and drives the address bus while ahold is active. this starts the write-back of the modified line that was the target of the snoop.

be3-be0
byte enable (active low; outputs)
the byte enable pins indicate which bytes are enabled and active during read or write cycles. during the first cache fill cycle, however, an external system should ignore these signals and assume that all bytes are active.
- be3 for d31-d24
- be2 for d23-d16
- be1 for d15-d8
- be0 for d7-d0
be3-be0 are active low and are not driven during bus hold.

blast (modified)
burst last (active low; output)
burst last goes low to tell the external system that the next brdy signal completes the burst bus cycle. blast is active for both burst and non-burst cycles. blast is active low and is not driven during a bus hold.

boff
back off (active low; input)
this input signal forces the microprocessor to float all pins normally floated during hold, but hlda is not asserted in response to boff. boff has higher priority than rdy or brdy; if both are returned in the same clock, boff takes effect. the microprocessor remains in bus hold until boff goes high. if a bus cycle is in progress when boff is asserted, the cycle restarts. boff must meet setup and hold times t18 and t19 for proper operation. boff has an internal weak pull-up.

brdy
burst ready input (active low; input)
the brdy signal performs the same function during a burst cycle that rdy performs during a non-burst cycle.
brdy indicates that the external system has presented valid data in response to a read, or that the external system has accepted data in response to a write. brdy is ignored when the bus is idle and at the end of the first clock in a bus cycle. brdy is sampled in the second and subsequent clocks of a burst cycle. the data presented on the data bus is strobed into the microprocessor when brdy is sampled active. if rdy is returned simultaneously with brdy, brdy is ignored and the cycle is converted to a non-burst cycle. brdy is active low and has a small pull-up resistor, and must satisfy the setup and hold times t16 and t17.

breq
internal cycle pending (active high; output)
breq indicates that the microprocessor has generated a bus request internally, whether or not the microprocessor is driving the bus. breq is active high and is floated only during three-state test mode (see flush).
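
the priority rules stated for boff, rdy, and brdy can be summarized in a few lines of code. the following is only an illustrative sketch, not part of the data sheet: the function name resolve_ready and the outcome labels are hypothetical, and each argument is assumed to be the active-low pin level (0 = asserted) sampled in one clock in which ready inputs are recognized.

```python
from enum import Enum


class Outcome(Enum):
    BACKOFF = "backoff"      # boff asserted: bus floated, cycle restarts later
    NON_BURST = "non_burst"  # rdy asserted: cycle ends as a non-burst cycle
    BURST = "burst"          # brdy asserted alone: one burst transfer accepted
    WAIT = "wait"            # nothing asserted: wait state


def resolve_ready(boff_n: int, rdy_n: int, brdy_n: int) -> Outcome:
    """Resolve the ready inputs sampled in one clock (hypothetical helper).

    All arguments are active-low pin levels: 0 means asserted.
    """
    if boff_n == 0:
        # boff has higher priority than rdy or brdy
        return Outcome.BACKOFF
    if rdy_n == 0:
        # rdy together with brdy: brdy is ignored, cycle becomes non-burst
        return Outcome.NON_BURST
    if brdy_n == 0:
        return Outcome.BURST
    return Outcome.WAIT
```

for example, resolve_ready(1, 0, 0) yields Outcome.NON_BURST, because brdy is ignored when rdy is returned in the same clock.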
bs8/bs16
bus size 8 (active low; input)/bus size 16 (active low; input)
the bs8 and bs16 signals allow the processor to operate with 8-bit and 16-bit i/o devices by running multiple bus cycles to respond to data requests: four for 8-bit devices, and two for 16-bit devices. the bus sizing pins are sampled every clock. the microprocessor samples the pins every clock before rdy to determine the appropriate bus size for the requesting device. the signals are active low inputs with internal pull-up resistors, and must satisfy setup and hold times t14 and t15 for correct operation. bus sizing is not permitted during copy-back or write-back operation. bs8 and bs16 are ignored during copy-back or write-back cycles.

cache (new)
internal cacheability (active low; output)
in write-through mode, this signal always floats. in write-back mode for processor-initiated cycles, a low output on this pin indicates that the current read cycle is cacheable, or that the current cycle is a burst write-back or copy-back cycle. if the cache signal is driven high during a read, the processor will not cache the data even if the ken pin signal is asserted. if the processor determines that the data is cacheable, cache goes active when ads is asserted and remains in that state until the next rdy or brdy is asserted. cache floats in response to a boff or hold request.

clk (modified)
clock (input)
the clk input provides the basic microprocessor timing signal. the clkmul input selects the multiplier value used to generate the internal operating frequency for the enhanced am486 microprocessor family. all external timing parameters are specified with respect to the rising edge of clk. the clock signal passes through an internal phase-lock loop (pll).

clkmul (new)
clock multiplier (input)
the microprocessor samples the clkmul input signal at reset to determine the design operating frequency.
an internal pull-up resistor connects to vcc, which selects clock-tripled mode if the input is high or left floating. for clock-doubled mode, the input must be pulled low. for dx2 versions, this input must always be connected to vss to ensure correct operation.

d31-d0
data lines (inputs/outputs)
lines d31-d0 define the data bus. the signals must meet setup and hold times t22 and t23 for proper read operations. these pins are driven during the second and subsequent clocks of write cycles.

d/c
data/control (output)
this bus cycle definition pin distinguishes memory and i/o data cycles from control cycles. the control cycles are:
- interrupt acknowledge
- halt/special cycle
- code read (instruction fetching)

dp3-dp0
data parity (inputs/outputs)
data parity is generated on all write data cycles with the same timing as the data driven by the microprocessor. even parity information must be driven back into the microprocessor on the data parity pins with the same timing as read information to ensure that the processor uses the correct parity check. the signals read on these pins do not affect program execution. input signals must meet setup and hold times t22 and t23. dp3-dp0 should be connected to vcc through a pull-up resistor in systems not using parity. dp3-dp0 are active high and are driven during the second and subsequent clocks of write cycles.

eads (modified)
external address strobe (active low; input)
this signal indicates that a valid external address has been driven on the address pins a31-a4 of the microprocessor to be used for a cache snoop. this signal is recognized while the processor is in hold (hlda is driven active), while forced off the bus with the boff input, or while ahold is asserted. the microprocessor ignores eads at all other times. eads is not recognized if hitm is active, nor during the clock after ads, nor during the clock after a valid assertion of eads.
snoops to the on-chip cache must be completed before another snoop cycle is initiated. table 2 describes eads when first sampled. eads can be asserted every other clock cycle as long as the hold remains active and hitm remains inactive. inv is sampled in the same clock period that eads is asserted. eads has an internal weak pull-up.

note: the triggering signal (ahold, hold, or boff) must remain active for at least 1 clock after eads to ensure proper operation.

table 2. eads sample time

trigger   eads first sampled
ahold     second clock after ahold asserted
hold      first clock after hlda asserted
boff      second clock after boff asserted
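
the eads recognition conditions and table 2 can be restated as a small lookup and predicate. this is a sketch under stated assumptions rather than a description of the silicon: the names EADS_FIRST_SAMPLE and eads_recognized are hypothetical, and every argument is a logical "asserted" flag, not a pin voltage level.

```python
# clocks after the trigger is asserted before eads is first sampled (table 2);
# for hold, the count starts when hlda is asserted.
EADS_FIRST_SAMPLE = {"ahold": 2, "hold": 1, "boff": 2}


def eads_recognized(*, ahold: bool, hlda: bool, boff: bool, hitm: bool,
                    ads_prev_clock: bool, eads_prev_clock: bool) -> bool:
    """Return True if an eads assertion would be recognized (hypothetical helper).

    All flags mean "logically asserted", regardless of pin polarity.
    """
    # eads is recognized only in hold (hlda active), back-off, or address hold
    snoop_window_open = ahold or hlda or boff
    # not recognized while hitm is active, in the clock after ads,
    # or in the clock after a valid eads assertion
    blocked = hitm or ads_prev_clock or eads_prev_clock
    return snoop_window_open and not blocked
```

the eads_prev_clock term is what limits snoops to every other clock cycle.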
ferr
floating-point error (active low; output)
driven active when a floating-point error occurs, ferr is similar to the error pin on a 387 math coprocessor. ferr is included for compatibility with systems using dos-type floating-point error reporting. ferr is active low, and is not floated during bus hold, except during three-state test mode (see flush).

flush
cache flush (active low; input)
in write-back mode, flush forces the microprocessor to write back all modified cache lines and invalidate its internal cache. the microprocessor generates two flush acknowledge special bus cycles to indicate completion of the write-back and invalidation. in write-through mode, flush invalidates the cache without issuing a special bus cycle. flush is an active low input that needs to be asserted only for one clock. flush is asynchronous, but setup and hold times t20 and t21 must be met for recognition in any specific clock. sampling flush low in the clock before the falling edge of reset causes the microprocessor to enter three-state test mode.

hitm (new)
hit modified line (active low; output)
in write-back mode (wb/wt = 1 at reset), hitm indicates that an external snoop cache tag comparison hit a modified line. when a snoop hits a modified line in the internal cache, the microprocessor asserts hitm two clocks after eads is asserted. the hitm signal stays asserted (low) until the last brdy for the corresponding write-back cycle. at all other times, hitm is deasserted (high). during reset, the hitm signal can be used to detect whether the cpu is operating in write-back mode. in write-back mode (wb/wt = 1 at reset), hitm is deasserted (driven high) until the first snoop that hits a modified line. in write-through mode, hitm floats at all times.

hlda
hold acknowledge (active high; output)
the hlda signal is activated in response to a hold request presented on the hold pin.
hlda indicates that the microprocessor has given the bus to another local bus master. hlda is driven active in the same clock in which the microprocessor floats its bus. hlda is driven inactive when leaving bus hold. hlda is active high and remains driven during bus hold. hlda is floated only during three-state test mode (see flush).

hold
bus hold request (active high; input)
hold gives control of the microprocessor bus to another bus master. in response to hold going active, the microprocessor floats most of its output and input/output pins. hlda is asserted after completing the current bus cycle, burst cycle, or sequence of locked cycles. the microprocessor remains in this state until hold is deasserted. hold is active high and does not have an internal pull-down resistor. hold must satisfy setup and hold times t18 and t19 for proper operation.

ignne
ignore numeric error (active low; input)
when this pin is asserted, the enhanced am486 microprocessor will ignore a numeric error and continue executing non-control floating-point instructions. when ignne is deasserted, the enhanced am486 microprocessor will freeze on a non-control floating-point instruction if a previous floating-point instruction caused an error. ignne has no effect when the ne bit in control register 0 is set. ignne is active low and is provided with a small internal pull-up resistor. ignne is asynchronous but must meet setup and hold times t20 and t21 to ensure recognition in any specific clock.

intr
maskable interrupt (active high; input)
when asserted, this signal indicates that an external interrupt has been generated. if the internal interrupt flag is set in eflags, active interrupt processing is initiated. the microprocessor generates two locked interrupt acknowledge bus cycles in response to the intr pin going active. intr must remain active until the interrupt acknowledges have been performed to ensure that the interrupt is recognized.
intr is active high and is not provided with an internal pull-down resistor. intr is asynchronous, but must meet setup and hold times t20 and t21 for recognition in any specific clock.

inv (new)
invalidate (active high; input)
the external system asserts inv to invalidate the cache-line state when an external bus master proposes a write. it is sampled together with a31-a4 during the clock in which eads is active. inv has an internal weak pull-up. inv is ignored in write-through mode.

ken
cache enable (active low; input)
ken determines whether the current cycle is cacheable. when the microprocessor generates a cacheable cycle and ken is active one clock before rdy or brdy during the first transfer of the cycle, the cycle becomes a cache line fill cycle. returning ken active one clock before rdy during the last read in the cache line fill causes the line to be placed in the on-chip cache. ken is active low and is provided with a small internal pull-up resistor. ken must satisfy setup and hold times t14 and t15 for proper operation.
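
the two ken sampling points described above can be read as a two-step decision. the sketch below is illustrative only: the function name line_fill_outcome and its return strings are hypothetical, the arguments are logical "asserted" flags, and the case where ken is not returned active on the last read is assumed to mean the line is simply not placed in the cache, which the text implies but does not state.

```python
def line_fill_outcome(cacheable_cycle: bool, ken_first: bool, ken_last: bool) -> str:
    """Classify a read cycle from the two ken samples (hypothetical helper).

    cacheable_cycle: the processor generated a cacheable cycle
    ken_first: ken active one clock before rdy/brdy on the first transfer
    ken_last: ken active one clock before rdy on the last read of the fill
    """
    if not (cacheable_cycle and ken_first):
        # the cycle never becomes a cache line fill
        return "non-cacheable cycle"
    if not ken_last:
        # assumption: without the second ken, the filled line is not kept
        return "line fill, line not cached"
    return "line fill, line cached"
```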
lock
bus lock (active low; output)
a low output on this pin indicates that the current bus cycle is locked. the microprocessor ignores hold when lock is asserted (although it does acknowledge ahold and boff). lock goes active in the first clock of the first locked bus cycle and goes inactive after the last clock of the last locked bus cycle. the last locked cycle ends when rdy is returned. lock is active low and is not driven during bus hold. locked read cycles are not transformed into cache fill cycles if ken is active.

m/io
memory/input-output (active high/active low; output)
a high output indicates a memory cycle. a low output indicates an i/o cycle.

nmi
non-maskable interrupt (active high; input)
a high nmi input signal indicates that an external non-maskable interrupt has occurred. nmi is rising-edge sensitive. nmi must be held low for at least four clk periods before this rising edge. the nmi input does not have an internal pull-down resistor. the nmi input is asynchronous, but must meet setup and hold times t20 and t21 for recognition in any specific clock.

pcd
page cache disable (active high; output)
this pin reflects the state of the pcd bit in the page table entry or page directory entry (programmable through the pcd bit in cr3). if paging is disabled, the cpu ignores the pcd bit and drives the pcd output low. pcd has the same timing as the cycle definition pins (m/io, d/c, and w/r). pcd is active high and is not driven during bus hold. pcd is masked by the cache disable bit (cd) in control register 0 (cr0).

pchk
parity status (active low; output)
parity status is driven on the pchk pin the clock after rdy for read operations. the parity status reflects data sampled at the end of the previous clock. a low pchk indicates a parity error. parity status is checked only for enabled bytes as is indicated by the byte enable and bus size signals.
pchk is valid only in the clock immediately after read data is returned to the microprocessor; at all other times pchk is inactive high. pchk is floated only during three-state test mode (see flush).

plock (modified)
pseudo-lock (active low; output)
in write-back mode, the processor forces the output high and the signal is always read as inactive. in write-through mode, plock operates normally. when asserted, plock indicates that the current bus transaction requires more than one bus cycle. examples of such operations are segment table descriptor reads (8 bytes) and cache line fills (16 bytes). the microprocessor drives plock active until the addresses for the last bus cycle of the transaction have been driven, whether or not rdy or brdy is returned. plock is a function of the bs8, bs16, and ken inputs. plock should be sampled on the clock when rdy is returned. plock is active low and is not driven during bus hold.

pwt
page write-through (active high; output)
this pin reflects the state of the pwt bit in the page table entry or page directory entry (programmable through the pwt bit in cr3). if paging is disabled, the cpu ignores the pwt bit and drives the pwt output low. pwt has the same timing as the cycle definition pins (m/io, d/c, and w/r). pwt is active high and is not driven during bus hold.

reset
reset (active high; input)
reset forces the microprocessor to initialize. the microprocessor cannot begin execution of instructions until at least 1 ms after vcc and clk have reached their proper dc and ac specifications. to ensure proper microprocessor operation, the reset pin should remain active during this time. reset is active high. reset is asynchronous but must meet setup and hold times t20 and t21 to ensure recognition on any specific clock.
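
the even-parity scheme described under dp3-dp0 and pchk can be illustrated with a short model. this is a sketch, not the processor's actual logic: the helper names even_parity_bit and pchk_level are hypothetical, the parity bit for each byte lane is assumed to make the total count of ones in the eight data bits plus the parity bit even, and only the byte enables (not bs8/bs16) are modeled when deciding which lanes are checked.

```python
def even_parity_bit(byte: int) -> int:
    """Parity bit that makes the 9-bit group (8 data bits + parity) even."""
    return bin(byte & 0xFF).count("1") & 1


def pchk_level(data: int, dp: int, byte_enables_n: int) -> int:
    """Model the pchk pin level after a read (hypothetical helper).

    data: 32-bit value on d31-d0
    dp: 4-bit value on dp3-dp0 (bit i corresponds to byte lane i)
    byte_enables_n: 4-bit active-low be3-be0 (0 = lane enabled)
    Returns 0 for a parity error (pchk is active low), otherwise 1.
    """
    for lane in range(4):
        if (byte_enables_n >> lane) & 1:
            continue  # lane disabled: parity is not checked
        byte = (data >> (8 * lane)) & 0xFF
        if even_parity_bit(byte) != (dp >> lane) & 1:
            return 0
    return 1
```

for example, a read of 0x00000001 with all lanes enabled needs dp0 = 1 and dp3-dp1 = 0 to avoid a parity error.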
rdy
non-burst ready (active low; input)
a low input on this pin indicates that the current bus cycle is complete, that is, either the external system has presented valid data on the data pins in response to a read, or the external system has accepted data from the microprocessor in response to a write. rdy is ignored when the bus is idle and at the end of the bus cycle's first clock. rdy is active during address hold. data can be returned to the processor while ahold is active. rdy is active low and does not have an internal pull-up resistor. rdy must satisfy setup and hold times t16 and t17 for proper chip operation.

smi (new)
smm interrupt (active low; input)
a low signal on the smi pin signals the processor to enter system management mode (smm). smi is the highest level processor interrupt. the smi signal is recognized on an instruction boundary, similar to the nmi and intr signals. smi is sampled on every rising clock edge. smi is a falling-edge sensitive input. recognition of smi is guaranteed in a specific clock if it is asserted synchronously and meets the setup and hold times. if smi is asserted asynchronously, it must go high for a minimum of two clocks before going low, and it must remain low for at least two clocks to guarantee recognition. when the cpu recognizes smi, it enters smm before executing the next instruction and saves internal registers in smm space.

smiact (new)
smm interrupt active (active low; output)
smiact goes low in response to smi. it indicates that the processor is operating under smm control. smiact remains low until the processor receives a reset signal or executes the resume instruction (rsm) to leave smm. this signal is always driven. it does not float during bus hold or boff.

note: do not use sreset to exit from smm. the system should block sreset during smm.

sreset (new)
soft reset (active high; input)
the cpu samples sreset on every rising clock edge. if sreset is sampled active, the sreset sequence begins on the next instruction boundary. sreset resets the processor, but, unlike reset, does not cause it to sample up or wb/wt, or affect the fpu, cache, cd and nw bits in cr0, and smbase. sreset is asynchronous and must meet the same timing as reset.

stpclk (new)
stop clock (active low; input)
a low input signal indicates a request has been made to turn off the clk input. when the cpu recognizes a stpclk, the processor:
- stops execution on the next instruction boundary (unless superseded by a higher priority interrupt).
- empties all internal pipelines and write buffers.
- generates a stop grant acknowledge bus cycle.
stpclk is active low and has an internal pull-up resistor. stpclk is asynchronous, but it must meet setup and hold times t20 and t21 to ensure recognition in any specific clock. stpclk must remain active until the stop clock special bus cycle is issued and the system returns either rdy or brdy.

tck
test clock (input)
test clock provides the clocking function for the jtag boundary scan feature. tck clocks state information and data into the component on the rising edge of tck on tms and tdi, respectively.
data is clocked out of the component on the falling edge of tck on tdo.

tdi
test data input (input)
tdi is the serial input that shifts jtag instructions and data into the tested component. tdi is sampled on the rising edge of tck during the shift-ir and the shift-dr tap (test access port) controller states. during all other tap controller states, tdi is ignored. tdi uses an internal weak pull-up.

tdo
test data output (active high; output)
tdo is the serial output that shifts jtag instructions and data out of the component. tdo is driven on the falling edge of tck during the shift-ir and shift-dr tap controller states. otherwise, tdo is three-stated.

tms
test mode select (active high; input)
tms is decoded by the jtag tap to select the operation of the test logic. tms is sampled on the rising edge of tck. to guarantee deterministic behavior of the tap controller, the tms pin has an internal pull-up resistor.

up
upgrade present (input)
the processor samples the upgrade present (up) pin in the clock before the falling edge of reset. if it is low, the processor three-states its outputs immediately. up must remain asserted to keep the processor inactive. the pin uses an internal pull-up resistor.

voldet (new; 168-pin pga package only)
voltage detect (output)
voldet provides an external signal to allow the system to determine the cpu input power level (3 v or 5 v). for enhanced am486 processors, the pin ties internally to vss.

wb/wt (new)
write-back/write-through (input)
if the processor samples wb/wt high at reset, the processor is configured in write-back mode, and all subsequent cache line fills sample wb/wt on the same clock edge in which the processor finds either rdy or the first brdy of a burst transfer to determine whether the cache line is designated as write-back or write-through. if the signal is low on the first brdy or rdy, the cache line is write-through. if the signal is high, the cache line is write-back.
if wb/wt is sampled low at reset, all cache line fills are write-through. wb/wt has an internal weak pull-down.

w/r
write/read (output)
a high output indicates a write cycle. a low output indicates a read cycle.

note: the enhanced am486 microprocessor family does not use the vcc5 pin used by some 3-v, clock-tripled, 486-based processors. the corresponding pin on the enhanced am486 microprocessor is an internal no connect (inc).
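
the two sampling points of wb/wt described above (once at reset, then once per cache line fill) can be condensed into a small function. this is an illustrative sketch only: the name line_policy is hypothetical, and the arguments are the logic levels (0 or 1) seen on the wb/wt pin at reset and at the rdy or first brdy of the line fill.

```python
def line_policy(wbwt_at_reset: int, wbwt_at_fill: int) -> str:
    """Return the write policy assigned to a newly filled cache line
    (hypothetical helper; arguments are pin logic levels, 1 = high)."""
    if not wbwt_at_reset:
        # write-through mode: every line fill is write-through
        return "write-through"
    # write-back mode: the level sampled at rdy / first brdy decides per line
    return "write-back" if wbwt_at_fill else "write-through"
```

because the pin has an internal weak pull-down, a system that leaves wb/wt unconnected corresponds to line_policy(0, 0), that is, write-through operation.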
enhanced am486 microprocessor amd 18 preliminary 4.3.3 protected mode protected mode provides access to the sophisticated memory management paging and privilege capabilities of the processor. 4.3.4 system management mode smm is a special operating mode described in detail in section 7. 4.4 cache architecture the enhanced am486 microprocessor family supports a superset architecture of the standard 486 cache im- plementation. this architectural enhancement im- proves not only cpu performance, but total system performance. 4.4.1 write-through cache the standard 486dx-type write-through cache architec- ture is characterized by the following: n external read accesses are placed in the cache if they meet proper caching requirements. n subsequent reads to the data in the cache are made if the address is stored in the cache tag array. n write operations to a valid address in the cache are updated in the cache and to external memory. this data writing technique is called write-through . the write-through cache implementation forces all writes to flow through to the external bus and back to main memory. consequently, the write-through cache generates a large amount of bus traffic on the external data bus. 4.4.2 write-back cache the microprocessor write-back cache architecture is characterized by the following: n external read accesses are placed in the cache if they meet proper caching requirements. n subsequent reads to the data in the cache are made if the address is stored in the cache tag array. n write operations to a valid address in the cache that is in the write-through (shared) state is updated in the cache and to external memory. n write operations to a valid address in the cache that is in the write-back (exclusive or modified) state is updated only in the cache. external memory is not updated at the time of the cache update. 
n modified data is written back to external memory when the modified cache line is being replaced with a new cache line (copy-back operation) or an exter- nal bus master has snooped a modified cache line (write-back). the write-back cache feature significantly reduces the amount of bus traffic on the external bus; however, it also adds complexity to the system design to maintain 4 functional description 4.1 overview enhanced am486 microprocessors use a 32-bit archi- tecture with on-chip memory management and cache memory units. the instruction set includes the complete 486 microprocessor instruction set along with exten- sions to serve the new extended applications. all soft- ware written for the 486 microprocessor and previous members of the x86 architectural family can run on the enhanced am486 microprocessor without modification. the on-chip memory management unit (mmu) is com- pletely compatible with the 486 mmu. the mmu in- cludes a segmentation unit and a paging unit. segmentation allows management of the logical address space by pro- viding easy data and code relocatibility and efficient sharing of global resources. the paging mechanism operates be- neath segmentation and is transparent to the segmentation process. paging is optional and can be disabled by system software. each segment can be divided into one or more 4-kbyte segments. to implement a virtual memory system, the enhanced am486 microprocessor supports full restart- ability for all page and segment faults. 4.2 memory memory is organized into one or more variable length segments, each up to 4 gbytes (2 32 bytes). a segment can have attributes associated with it, including its location, size, type (i.e., stack, code, or data), and protection charac- teristics. each task on a microprocessor can have a maxi- mum of 16,381 segments, each up to 4 gbytes. thus, each task has a maximum of 64 tbytes of virtual memory. 
the segmentation unit provides four levels of protection for isolating and protecting applications and the operating system from each other. the hardware-enforced protection allows high integrity system designs.

4.3 modes of operation
the enhanced am486 microprocessor has four modes of operation: real address mode (real mode), virtual 8086 address mode (virtual mode), protected address mode (protected mode), and system management mode (smm).

4.3.1 real mode
in real mode, the enhanced am486 microprocessor operates as a fast 8086. real mode is required primarily to set up the processor for protected mode operation.

4.3.2 virtual mode
in virtual mode, the processor appears to be in real mode, but can use the extended memory accessing of protected mode.
memory coherency. the write-back cache requires enhanced system support because the cache may contain data that is not identical to data in main memory at the same address location.

4.5 write-back cache protocol
the enhanced am486 microprocessor family write-back cache coherency protocol reduces bus activity while maintaining data coherency in a multi-master environment. the cache coherency protocol offers the following advantages:
- no unnecessary bus traffic. the protocol dynamically identifies shared data to the granularity of a cache line. this dynamic identification ensures that the traffic on the external bus is the minimum necessary to ensure coherency.
- software-transparent. because the protocol gives the appearance of a single unified memory, software does not have to maintain coherency or identify shared data. application software developed for a system without a cache can run without modification. software support is required only in the operating system to identify non-cacheable data regions.
a modified mesi protocol is implemented on the enhanced am486 microprocessor family for systems with write-back cache support. mesi allows a cache line to exist in one of four states: modified, exclusive, shared, and invalid. the enhanced am486 microprocessor family allocates memory in the cache due to a read miss. write allocation is not implemented. to maintain coherency between cache and main memory, the mesi protocol has the following characteristics:
- the system memory is always updated when a snoop hits a modified line.
- if a modified line is hit by another master during snooping, the master is forced off the bus and the snooped cache writes back the modified line to the system memory. after the snooped cache completes the write, the backed-off bus master restarts the access and reads the modified data from memory.
4.5.1 cache line overview
to implement the enhanced am486 microprocessor cache coherency protocol, each tag entry is expanded with 2 status bits: s1 and s0. each tag entry is associated with a cache line. table 3 shows the cache line organization.

table 3. cache line organization
data words (32 bits)    address tag and status
d0                      address tag, s1, s0
d1
d2
d3

4.5.2 line status and line state
a cache line can occupy one of four legal states as indicated by bits s0 and s1. the line states are shown in table 4. each line in the cache is in one of these states. the state transition is induced either by the processor or during snooping from an external bus master.

4.5.2.1 invalid
an invalid cache line does not contain valid data for any external memory location. an invalid line does not participate in the cache coherency protocol.

4.5.2.2 exclusive
an exclusive line contains valid data for some external memory location. the data exactly matches the data in the external memory location.

4.5.2.3 shared
a shared line contains valid data for an external memory location. either the data is shared by another cache and exactly matches the data in the external memory, or the shared state indicates that the cache line is in write-through mode.

4.5.2.4 modified
a modified line contains valid data for an external memory location. however, the data does not match the data in the external location because the processor has modified the data since it was loaded from the external memory. a cache that contains a modified line is responsible for ensuring that the data is properly maintained. this means that in the case of an external access to that line from another external bus master, the modified line is first written back to the external memory before the other external bus master can complete its access. table 5 shows the mesi cache line states and the corresponding availability of data.

table 4. legal cache line states
s1   s0   line state
0    0    invalid
0    1    exclusive
1    0    modified
1    1    shared
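The encoding in table 4 can be expressed as a small lookup. The Python sketch below is illustrative only: the function names are hypothetical and the states are represented as plain strings. It transcribes table 4 and the memory-consistency statements of sections 4.5.2.1 through 4.5.2.4.

```python
# (s1, s0) -> line state, transcribed from table 4.
LINE_STATES = {
    (0, 0): "invalid",
    (0, 1): "exclusive",
    (1, 0): "modified",
    (1, 1): "shared",
}


def decode_line_state(s1, s0):
    """Map the two status bits of a tag entry to a MESI state name."""
    if s1 not in (0, 1) or s0 not in (0, 1):
        raise ValueError("s1 and s0 must each be 0 or 1")
    return LINE_STATES[(s1, s0)]


def memory_matches_cache(state):
    """Whether external memory matches the cached data (sections 4.5.2.x).

    Returns None for an invalid line, which holds no valid data at all.
    """
    if state == "invalid":
        return None
    return state in ("exclusive", "shared")


for bits in sorted(LINE_STATES):
    state = decode_line_state(*bits)
    print(bits, state, memory_matches_cache(state))
```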
4.6 cache replacement description
the cache line replacement algorithm uses the standard am486 cpu pseudo lru (least-recently used) strategy. when a line must be placed in the internal cache, the microprocessor first checks to see if there is an invalid line available in the set. if no invalid line is available, the lru algorithm replaces the least-recently used cache line in the four-way set with the new cache line. if the cache line for replacement is modified, the modified cache line is placed into the copy-back buffer for copying back to external memory, and the new cache line is placed into the cache. this copy-back ensures that the external memory is updated with the modified data upon replacement.

4.7 memory configuration
in computer systems, memory regions require specific caching and memory write methods. for example, some memory regions are non-cacheable while others are cacheable but are write-through. to allow maximum flexibility in memory configuration, the microprocessor supports specific memory region requirements. all bus masters, such as dma controllers, must reflect all data transfers on the microprocessor local bus so that the microprocessor can respond appropriately.

4.7.1 cacheability
the enhanced am486 cpu caches data based on the state of the cd and nw bits in cr0, in conjunction with the ken signal, at the time of a burst read access from memory. if the wb/wt signal is low during the first brdy, ken meets the standard setup and hold requirements, and the four 32-bit doublewords are still placed in the cache. however, all cacheable accesses in this mode are considered write-through. when the wb/wt signal is high during the first brdy, the entire four 32-bit doubleword transfer is considered write-back.
note: the cd bit in cr0 enables (0) or disables (1) the internal cache. the nw bit in cr0 enables (0) or disables (1) write-through and snooping cycles. reset sets cd and nw to 1.
unlike reset, however, sreset does not invalidate the cache nor does it modify the values of cd and nw in cr0.

table 5. mesi cache line status
situation                       modified                 exclusive                shared                        invalid
line valid?                     yes                      yes                      yes                           no
external memory is...           out-of-date              valid                    valid                         status unknown
a write to this cache line...   does not go to the bus   does not go to the bus   goes to the bus and updates   goes directly to the bus

4.7.2 write-through/write-back
if the cpu is operating in write-back mode (i.e., the wb/wt pin was sampled high at reset), the wb/wt pin indicates whether an individual write access is executed as write-through or write-back. the enhanced am486 microprocessor does this on an access-by-access basis. once the cache line is in the cache, the status bit is tested each time the processor writes to the cache line or a tag compare results in a hit during bus watching mode. if the wb/wt signal is low during the first brdy of the cache line read access, the cache line is considered a write-through access. therefore, all writes to this location in the cache are reflected on the external bus, even if the cache line is write protected.

4.8 cache functionality in write-back mode
the description of cache functionality in write-back mode is divided into two sections: processor-initiated cache functions and snooping actions.

4.8.1 processor-induced actions and state transitions
the microprocessor contains two new buffers for use with the mesi protocol support: the copy-back buffer and the write-back buffer. the processor uses the copy-back buffer for cache line replacement of modified lines. the write-back buffer is used when an external bus master hits a modified line in the cache during a snoop operation and the cache line is designated for write-back to main memory. each buffer is four doublewords in size. figure 1 shows a diagram of the state transitions induced by the local processor.
when a read miss occurs, the line selected for replacement remains in the modified state until overwritten. a copy of the modified line is sent to the copy-back buffer to be written back after replacement. when reload has successfully completed, the line is set either to the exclusive or the shared state, depending on the state of pwt and wb/wt signals.

[figure 1. processor-induced line transitions in write-back mode. states: invalid, exclusive, shared, modified. transition labels: read_miss (wb/wt = 1) * (pwt = 0); read_miss [(wb/wt = 0) + (pwt = 1)]; read_hit; write_hit; write_hit + read_hit; read_hit + write_hit.]
note: write_hit generates external bus cycle.
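The processor-induced transitions of figure 1 and section 4.8.1 can be summarized in a few lines of Python. This is a minimal sketch, not the device logic: the function names are hypothetical, states are plain strings, and the write-miss case assumes only what section 4.5 states (write allocation is not implemented, so the write goes to the bus and no line is allocated).

```python
def state_after_line_fill(pwt, wb_wt):
    """State given to a line after a read-miss reload.

    pwt = 1 overrides wb/wt and forces the shared (write-through) state.
    """
    if pwt == 0 and wb_wt == 1:
        return "exclusive"
    return "shared"


def processor_write(state):
    """Return (new_state, external_bus_cycle) for a processor write."""
    if state == "exclusive":
        return ("modified", False)   # written into the cache only
    if state == "modified":
        return ("modified", False)   # no further external action
    if state == "shared":
        return ("shared", True)      # write-through: memory is updated too
    if state == "invalid":
        return ("invalid", True)     # write miss: no write allocation
    raise ValueError(f"unknown line state: {state}")


line = state_after_line_fill(pwt=0, wb_wt=1)
print(line, processor_write(line))
```

A read hit leaves the state unchanged in every valid state, so it is not modeled here.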
enhanced am486 microprocessor amd 21 preliminary if the pwt signal is 0, the external wb/wt signal de- termines the new state of the line. if the wb/wt signal was asserted to 1 during reload, the line transits to the exclusive state. if the wb/wt signal was 0, the line transits to the shared state. if the pwt signal is 1, it overrides the wb/wt signal, forcing the line into the shared state. therefore, if paging is enabled, the soft- ware programmed pwt bit can override the hardware signal wb/wt . until the line is reallocated, a write is the only processor action that can change the state of the line. if the write occurs to a line in the exclusive state, the data is simply written into the cache and the line state is changed to modified. the modified state indicates that the contents of the line require copy-back to the main memory before the line is reallocated. if the write occurs to a line in the shared state, the cache performs a write of the data on the external bus to up- date the external memory. the line remains in the shared state until it is replaced with a new cache line or until it is flushed. in the modified state, the processor continues to write the line without any further external actions or state transitions. if the pwt or pcd bits are changed for a specified mem- ory location, the tag bits in the cache are assumed to be correct. to avoid memory inconsistencies with re- spect to cacheability and write status, a cache copy- back and invalidation should be invoked either by using the wbinvd instruction or asserting the flush signal. 4.8.2 snooping actions and state transitions to maintain cache coherency, the cpu must allow snooping by the current bus master. the bus master initiates a snoop cycle to check whether an a ddress is cached in the internal cache of the microprocessor. a snoop cycle differs from any other cycle in that it is ini- tiated externally to the microprocessor, and the signal for beginning the cycle is eads instead of ads . 
the address bus of the microprocessor is bidirectional to allow the address of the snoop to be driven by the system. a snoop access can begin during any hold state:
- while hold and hlda are asserted
- while boff is asserted
- while ahold is asserted
in the clock in which eads is asserted, the microprocessor samples the inv input to qualify the type of inquiry. inv specifies whether the line (if found) must be invalidated (i.e., the mesi status changes to invalid or i). a line is invalidated if the snoop access was generated due to a write of another bus master. this is indicated by inv set to 1. in the case of a read, the line does not have to be invalidated, which is indicated by inv set to 0.
the core system logic can generate eads by watching the ads from the current bus master, and inv by watching the w/r signal. the microprocessor compares the address of the snoop request with addresses of lines in the cache and of any line in the copy-back buffer waiting to be transferred on the bus. it does not, however, compare with the address of write-miss data in the write buffers. two clock cycles after sampling eads, the microprocessor drives the results of the snoop on the hitm pin. if hitm is active, the line was found in the modified state; if inactive, the line was in the exclusive or shared state, or was not found.
figure 2 shows a diagram of the state transitions induced by snooping accesses.

4.8.2.1 difference between snooping access cases
snooping accesses are external accesses to the microprocessor. as described earlier, the snooping logic has a set of signals independent from the processor-related signals. those signals are:
- eads
- inv
- hitm
in addition to these signals, the address bus is required as an input. this is achieved by setting ahold, hold, or boff active. snooping can occur in parallel with a processor-initiated access that has already been started.
the two accesses depend on each other only when a modified line is written back. in this case, the snoop requires the use of the cycle control signals and the data bus. the following sections describe the scenarios for the hold, ahold, and boff implementations.

[figure 2. snooping state transitions. states: invalid, exclusive, shared, modified. transition labels: eads = 0 * inv = 0 * flush = 1 (three transitions); (eads = 0 * inv = 1) + flush = 0 (three transitions). the two transitions out of the modified state also carry the label (hitm asserted + write-back).]
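The snoop behavior described in section 4.8.2 can be sketched as a single function. This Python fragment is an illustration under stated assumptions: the function name is hypothetical, the flush input is not modeled, and the exclusive-to-shared transition on a read snoop (inv = 0) is inferred from the labels of figure 2 rather than spelled out in the text. `hitm_active` is a boolean meaning the active-low hitm pin is driven to 0.

```python
def snoop(state, inv):
    """Return (new_state, hitm_active, write_back) for one snoop cycle.

    inv = 1: the snoop was caused by a write of another bus master.
    inv = 0: the snoop was caused by a read.
    """
    if inv not in (0, 1):
        raise ValueError("inv must be 0 or 1")
    if state == "invalid":
        return ("invalid", False, False)      # snoop miss
    if state not in ("exclusive", "shared", "modified"):
        raise ValueError(f"unknown line state: {state}")
    hit_modified = state == "modified"        # only a modified hit drives hitm
    new_state = "invalid" if inv == 1 else "shared"
    return (new_state, hit_modified, hit_modified)


for start in ("invalid", "exclusive", "shared", "modified"):
    for inv in (0, 1):
        print(start, inv, snoop(start, inv))
```

In the modified case the new state applies after the write-back has completed, which matches the final step of the write-back scenarios: shared for inv = 0, invalid for inv = 1.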
enhanced am486 microprocessor amd 22 preliminary 4.8.2.2 hold bus arbitration implementation the hold/hlda bus arbitration scheme is used prima- rily in systems where all memory transfers are seen by the microprocessor. the hold/hlda bus arbitration scheme permits simple write-back cache design while maintaining a relatively high performing system. figure 3 shows a typical system block diagram for hold/ hlda bus arbitration. note: to maintain proper system timing, the hold signal must remain active for one clock cycle after hitm transitions active. deassertion of hold in the same clock cycle as hitm assertion may lead to unpredictable processor behavior. 4.8.2.2.1 processor-induced bus cycles in the following scenarios, read accesses are assumed to be cache line fills. the cases also assume that the core system logic does not return brdy or rdy until hitm is sampled. the addition of wait states follows the standard 486 bus protocol. for demonstration purpos- es, only the zero wait state approach is shown. table 6 explains the key to switching waveforms. cpu l2 cache dram local bus peripheral i/o bus interface slow peripheral address bus data bus address bus data bus figure 3. typical system block diagram for hold/hlda bus arbitration 4.8.2.2.2 external read scenario : the data resides in external memory (see figure 4). step 1 the processor starts the external read access by asserting ads = 0 and w/r = 0. step 2 wb/wt is sampled in the same cycle as brdy . if wb/wt = 1, the data resides in a write-back cache- able memory location. step 3 the processor completes its burst read and as- serts blast . 4.8.2.2.3 external write scenario: the data is written to the external memory (see figure 5). step 1 the processor starts the external write access by asserting ads = 0 and w/r = 1. step 2 the processor completes its write to the core system logic. 
4.8.2.2.4 hold/hlda external access timing
in systems with two or more bus masters, each bus master is equipped with individual hold and hlda control signals. these signals are then centralized to the core system logic that controls individual bus masters, depending on bus request signals and the hitm signal.

table 6. key to switching waveforms
waveform    inputs                              outputs
[graphic]   must be steady                      will be steady
[graphic]   may change from h to l              will change from h to l
[graphic]   may change from l to h              will change from l to h
[graphic]   don't care; any change permitted    changing; state unknown
[graphic]   does not apply                      center line is high-impedance off state
[figure 4. external read. signals shown: clk, adr, m/io, w/r, ads, blast, brdy, data, ken, wb/wt, boff. note: the circled numbers in this figure represent the steps in section 4.8.2.2.2.]
[figure 5. external write. signals shown: clk, adr, m/io, w/r, ads, blast, brdy, data, wb/wt, boff. note: the circled numbers in this figure represent the steps in section 4.8.2.2.3.]
[figure 6. snoop of on-chip cache that does not hit a line. signals shown: clk, adr, inv, hold, hlda, eads, hitm. note: the circled numbers in this figure represent the steps in section 4.8.3.1.]
[figure 7. snoop of on-chip cache that hits a non-modified line. signals shown: clk, adr, inv, hold, hlda, eads, hitm. note: the circled numbers in this figure represent the steps in section 4.8.3.2.]
amd 25 preliminary enhanced am486 microprocessor 4.8.3 external bus master snooping actions the following scenarios describe the snooping actions of an external bus master. 4.8.3.1 snoop miss scenario : a snoop of the on-chip cache does not hit a line, as shown in figure 6. step 1 the microprocessor is placed in snooping mode with hold. hlda must be high for a minimum of one clock cycle before eads as- sertion. in the fastest case, this means that hold was asserted one clock cycle before the hlda response. step 2 eads and inv are applied to the microprocessor. if inv is 0, a read access caused the snooping cycle. if inv is 1, a write access caused the snooping cycle. step 3 two clock cycles after eads was asserted, the snooping signal hitm becomes valid. because the addressed line is not in the snooping cache, hitm is 1. 4.8.3.2 snoop hit to a non-modified line scenario : the snoop of the on-chip cache hits a line, and the line is not modified (see figure 7). step 1 the microprocessor is placed in snooping mode with hold. hlda must be high for a minimum of one clock cycle before eads as- sertion. in the fastest case, this means that hold was asserted one clock cycle before the hlda response. step 2 eads and inv are applied to the microprocessor. if inv is 0, a read access caused the snooping cycle. if inv is 1, a write access caused the snooping cycle. step 3 two clock cycles after eads is asserted, hitm becomes valid. in this case, hitm is 1. 4.8.4 write-back case scenario : write-back accesses are always burst writes with a length of four 32-bit words. for burst writes, the burst always starts with the microprocessor line offset at 0. hold must be deasserted before the write-back can be performed (see fig- ure 8). step 1 hold places the microprocessor in snooping mode. hlda must be high for a minimum of one clock cycle before eads assertion. in the fastest case, this means that hold asserts one clock cy- cle before the hlda response. 
step 2 eads and inv are asserted. if inv is 0, snooping is caused by a read access. if inv is 1, snooping is caused by a write access. eads is not sampled again until after the modified line is written back to memory. it is detected again as early as in step 11.

[figure 8. snoop that hits a modified line (write-back). signals shown: clk, adr, m/io, w/r, blast, brdy, inv, ads, hitm, hold, hlda, data, cache, eads, external bus master's boff signal. note: the circled numbers in this figure represent the steps in section 4.8.4.]
step 3 two clock cycles after eads is asserted, hitm becomes valid, and is 0 because the line is modified.
step 4 in the next clock, the core system logic deasserts the hold signal in response to the hitm = 0 signal. the core system logic backs off the current bus master at the same time so that the microprocessor can access the bus. hold can be reasserted immediately after ads is asserted for burst cycles.
step 5 the snooping cache starts its write-back of the modified line by asserting ads = 0, cache = 0, and w/r = 1. the write access is a burst write. the number of clock cycles between deasserting hold to the snooping cache and first asserting ads for the write-back cycles can vary. in this example, it is one clock cycle, which is the shortest possible time. regardless of the number of clock cycles, the start of the write-back is seen by ads going low.
step 6 the write-back access is finished when blast and brdy both are 0.
step 7 in the clock cycle after the final write-back access, the processor drives hitm back to 1.
step 8 hold is sampled by the microprocessor.
step 9 one cycle after sampling hold high, the microprocessor transitions hlda to 1, acknowledging the hold request.
step 10 the core system logic removes hold-off control to the external bus master. this allows the external bus master to immediately retry the aborted access. ads is strobed low, which generates eads low in the same clock cycle.
step 11 the bus master restarts the aborted access. eads and inv are applied to the microprocessor as before. this starts another snoop cycle. the status of the addressed line is now either shared (inv = 0) or is changed to invalid (inv = 1).

4.8.5 write-back and pending access
scenario: the following occurs when, in addition to the write-back operation, other bus accesses initiated by the processor associated with the snooped cache are pending.
the microprocessor gives the write-back access priority. this implies that if hold is deasserted, the microprocessor first writes back the modified line (see figure 9).

[figure 9. write-back and pending access. signals shown: clk, adr, m/io, w/r, blast, brdy, inv, ads, hitm, hold, hlda, data, cache, eads, external bus master's boff signal. note: the circled numbers in this figure represent the steps in section 4.8.5.]
amd 27 preliminary enhanced am486 microprocessor step 1 hold places the microprocessor in snooping mode. hlda must be high for a minimum of one clock cycle before eads assertion. in the fastest case, this means that hold asserts one clock cy- cle before the hlda response. step 2 eads and inv are asserted. if inv is 0, snooping is caused by a read access. if inv is 1, snooping is caused by a write access. eads is not sampled again until after the modified line is written back to memory. it is detected again as early as in step 11. step 3 two clock cycles after eads is asserted, hitm becomes valid, and is 0 because the line is modi- fied. step 4 in the next clock the core system logic deas- serts the hold signal in response to the hitm = 0. the core system logic backs off the current bus master at the same time so that the microprocessor can access the bus. hold can be reasserted im- mediately after ads is asserted for burst cycles. step 5 the snooping cache starts its write-back of the modified line by asserting ads = 0, cache = 0, and w/r = 1. the write access is a burst write. the number of clock cycles between deasserting hold to the snooping cache and first asserting ads for the write-back cycles can vary. in this example, it is one clock cycle, which is the shortest possible time. regardless of the number of clock cycles, the start of the write-back is seen by ads going low. step 6 the write-back access is finished when blast and brdy both are 0. step 7 in the clock cycle after the final write-back ac- cess, the processor drives hitm back to 1. step 8 hold is sampled by the microprocessor. step 9 a minimum of 1 clock cycle after the completion of the pending access, hlda transitions to 1, acknowledging the hold request. step 10 the core system logic removes hold-off control to the external bus master. this allows the ex- ternal bus master to immediately retry the abort- ed access. ads is strobed low, which generates eads low in the same clock cycle. 
step 11 the bus master restarts the aborted access. eads and inv are applied to the microprocessor as before. this starts another snoop cycle. the status of the addressed line is now either shared (inv = 0) or is changed to invalid (inv = 1).

4.8.5.1 hold/hlda write-back design considerations
when designing a write-back cache system that uses hold/hlda as the bus arbitration method, the following considerations must be observed to ensure proper operation (see figure 10).

[figure 10. valid hold assertion during write-back. signals shown: clk, ads, blast, brdy, hold, hlda, hitm. the valid hold assertion window is marked on the hold waveform.]
enhanced am486 microprocessor amd 28 preliminary step 1 during a snoop to the on-chip cache that hits a modified cache line, the hold signal cannot be deasserted to the microprocessor until the next clock cycle after hitm transitions active. step 2 after the write-back has commenced, the hold signal should be asserted no earlier than the next clock cycle after ads goes active, and no later than in the final brdy of the last write. asserting hold later than the final brdy may allow the microprocessor to permit a pending access to begin. step 3 if rdy is returned instead of brdy during a write-back, the hold signal can be reasserted at any time starting one clock after ads goes active in the first transfer up to the final transfer when rdy is asserted. asserting rdy instead of brdy will not break the write-back cycle if hold is asserted. the processor ignores hold until the final write cycle of the write- back. 4.8.5.2 ahold bus arbitration implementation the use of ahold as the control mechanism is often found in systems where an external second-level cache is closely coupled to the microprocessor. this tight cou- pling allows the microprocessor to operate with the least amount of stalling from external snooping of the on-chip cache. additionally, snooping of the cache can be per- formed concurrently with an access by the microproces- sor. this feature further improves the performance of the total system (see figure 11). note: to maintain proper system timing, the ahold signal must remain active for one clock cycle after hitm transitions active. deassertion of ahold in the same clock cycle as hitm assertion may lead to unpredictable processor behavior. dram address bus data bus l2 cache address bus data bus i/o bus interface slow peripheral cpu address bus data bus figure 11. closely coupled cache block diagram the following sections describe the snooping scenarios for the ahold implementation. 
4.8.5.3 normal write-back scenario : this scenario assumes that a processor-initiated access has already started and that the external logic can finish that access even without the address being applied after the first clock cycle. therefore, a snooping access with ahold can be done in parallel. in this case, the processor- initiated access is finished first, then the write-back is executed (see figure 12). the sequence is as follows: step 1 the processor initiates an external, simple, non-cacheable read access, strobing ads = 0 and w/r = 0. the address is driven from the cpu. step 2 in the same cycle, ahold is asserted to indi- cate the start of snooping. the address bus floats and becomes an input in the next clock cycle. step 3 during the next clock cycles, the brdy or rdy signal is not strobed low. therefore, the proces- sor-initiated access is not finished. step 4 two clock cycles after ahold is asserted, the eads signal is activated to start an actual snoop- ing cycle, and inv is valid. if inv is 0, a read access caused the snooping cycle. if inv is 1, a write ac- cess caused the snooping cycle. additional eads are ignored due to the hit of a modified line. it is detected after hitm goes inactive. step 5 two clock cycles after eads is asserted, the snooping signal hitm becomes valid. the line is modified; therefore, hitm is 0. step 6 in this cycle, the processor-initiated access is finished. step 7 two clock cycles after the end of the processor- initiated access, the cache immediately starts writing back the modified line. this is indicated by ads = 0 and w/r = 1. note that ahold is still active and the address bus is still an input. however, the write-back access can be execut- ed without any address. this is because the corresponding address must have been on the bus when eads was strobed. therefore, in the case of the core system logic, the address for the write-back must be latched with eads to be available later. 
this latching is required only if ahold is not removed when hitm becomes 0. otherwise, the address of the write-back is put onto the address bus by the microprocessor.
enhanced am486 microprocessor amd 29 preliminary step 8 as an example, ahold is now removed. in the next clock cycle, the current address of the write-back access is driven onto the address bus. step 9 the write-back access is finished when blast and brdy both transition to 0. step 10 in the clock cycle after the final write-back access, the snooping cache drives hitm back to 1. the status of the snooped and written-back line is now either shared (inv = 0) or is changed to invalid (inv = 1). 4.8.6 reordering of write-backs (ahold) with boff as seen previously, the bus interface unit (biu) com- pletes the processor-initiated access first if the snooping access occurs after the st art of the processor-initiated access. if the hitm signal occurs one clock cycle before the ads = 0 of the processor-initiated access, the write-back receives priority and is executed first. however, if the snooping access is executed after the start of the processor-initiated access, there is a meth- odology to reorder the access order. the boff signal delays outstanding processor-initiated cycles so that a snoop write-back can occur immediately (see figure 13). scenario : if there are outstanding processor-initiated cy- cles on the bus, asserting boff clears the bus pipeline. if a snoop causes hitm to be asserted, the first cycle issued by the microprocessor after deassertion of boff is the write-back cycle. after the write-back cycle, it reissues the aborted cycles. this translates into the following sequence: step 1 the processor starts a cacheable burst read cycle. step 2 one clock cycle later, ahold is asserted. this switches the address bus into an input one clock cycle after ahold is asserted. step 3 two clock cycles after ahold is asserted, the eads and inv signals are asserted to start the snooping cycle. step 4 two clock cycles after eads is asserted, hitm becomes valid. the line is modified, therefore hitm = 0. 
step 5 note that the processor-initiated access is not completed because blast = 1.
step 6 with hitm going low, the core system logic asserts boff in the next clock cycle to the snooping processor to reorder the access. boff overrides brdy. therefore, the partial read is not used. it is reread later.
step 7 one clock cycle later boff is deasserted. the write-back access starts one clock cycle later because the boff has cleared the bus pipeline.

[figure 12. snoop hit cycle with write-back. signals shown: clk, adr, m/io, w/r, ads, blast, ahold, brdy, inv, eads, hitm, data, cache. data labels: read, w n, w n+4, w n+8, w n+c. note: the circled numbers in this figure represent the steps in section 4.8.5.3.]
step 8  ahold is deasserted. in the next clock cycle the address for the write-back is driven on the address bus.

step 9  one cycle after boff is deasserted, the cache immediately starts writing back the modified line. this is indicated by ads = 0 and w/r = 1.

step 10  the write-back access is finished when blast and brdy go active 0.

step 11  the biu restarts the aborted cache line fill with the previous read. this is indicated by ads = 0 and w/r = 0.

step 12  in the same clock cycle, the snooping cache drives hitm back to 1.

step 13  the previous read is now reread.

[figure 13. cycle reordering with boff (write-back). signals shown: clk, adr, m/io, w/r, ads, blast, brdy, ahold, inv, eads, hitm, data, boff, cache. note: the circled numbers in this figure represent the steps in section 4.8.6.]

4.8.7 special scenarios for ahold snooping

in addition to the previously described scenarios, there are special scenarios regarding the time of the eads and ahold assertion. the final result depends on the time eads and ahold are asserted relative to other processor-initiated operations.

4.8.7.1 write cycle reordering due to buffering

scenario: the mesi cache protocol and the ability to perform and respond to snoop cycles guarantee that writes to the cache are logically equivalent to writes to memory. in particular, the order of read and write operations on cached data is the same as if the operations were on data in memory. even non-cached memory read and write requests usually occur on the external bus in the same order that they were issued in the program. for example, when a write miss is followed by a read miss, the write data goes on the bus before the read request is put on the bus. however, the posting of writes in write buffers coupled with snooping cycles may cause the order of writes seen on the external bus to differ from the order they appear in the program. consider the following example, which is illustrated in figure 14. for simplicity, snooping signals that behave in their usual manner are not shown.

step 1  ahold is asserted. no further processor-initiated accesses to the external bus can be started. no other access is in progress.

step 2  the processor writes data a to the cache, resulting in a write miss. therefore, the data is put into the write buffers, assuming they are not full. no external access can be started because ahold is still 1.
step 3  the next write of the processor hits the cache and the line is non-shared. therefore, data b is written into the cache. the cache line transitions to the modified state.

step 4  in the same clock cycle, a snoop request to the same address where data b resides is started because eads = 0. the snoop hits a modified line. eads is ignored due to the hit of a modified line, but is detected again as early as in step 10.

step 5  two clock cycles after eads asserts, hitm becomes valid.

step 6  because the processor-initiated access cannot be finished (ahold is still 1), the biu gives priority to a write-back access that does not require the use of the address bus. therefore, in the clock cycle, the cache starts the write-back sequence indicated by ads = 0 and w/r = 1.

step 7  during the write-back sequence, ahold is deasserted.

step 8  the write-back access is finished when blast and brdy transition to 0.

step 9  after the last write-back access, the biu starts writing data a from the write buffers. this is indicated by ads = 0 and w/r = 1.

step 10  in the same clock cycle, the snooping cache drives hitm back to 1.

step 11  the write of data a is finished when brdy transitions to 0 (blast = 0), because it is a single word.

the software write sequence was first data a and then data b. but on the external bus the data appear first as data b and then data a. the order of writes is changed.

in most cases it is unnecessary to strictly maintain the ordering of writes. however, some cases (for example, writing to hardware control registers) require writes to be observed externally in the same order as programmed. there are two options to ensure serialization of writes, both of which drive the cache to write-through mode:

1) set the pwt bit in the page table entries.
2) drive the wb/wt signal low when accessing these memory locations.

option 1 is an operating-system level solution not directly implemented by user-level code. option 2, the hardware solution, is implemented at the system level.

[figure 14. write reordering due to buffering. signals shown: clk, ahold, eads, hitm, ads, blast, brdy, data, cached data, write buffer. note: the circled numbers in this figure represent the steps in section 4.8.7.1.]
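the reordering described in steps 1 to 11 can be replayed with a small python model. this is a minimal sketch under stated assumptions, not a description of the actual bus interface unit: the function name bus_write_order and its write_through flag are hypothetical, clock-level timing is ignored, and the write-through branch simply assumes that a write-through hit is posted behind data a in program order.

```python
from collections import deque


def bus_write_order(write_through: bool) -> list:
    """Replay section 4.8.7.1 and return the data items in external-bus order."""
    bus = []                  # writes in the order they reach the external bus
    write_buffer = deque()    # posted writes waiting for the bus
    modified_lines = set()    # cached lines currently in the modified state

    # step 1: ahold is asserted, so no processor-initiated bus access can start.
    # step 2: the write of data a misses the cache and is posted.
    write_buffer.append("a")

    # step 3: the write of data b hits a non-shared cache line.
    if write_through:
        # assumption: a write-through hit is posted behind a, in program order.
        write_buffer.append("b")
    else:
        modified_lines.add("b")   # write-back: the line only becomes modified

    # steps 4-8: a snoop hits the address of b while ahold is still asserted.
    # a modified line is written back at once, ahead of the posted write.
    if "b" in modified_lines:
        bus.append("b")
        modified_lines.discard("b")

    # steps 7-11: ahold is deasserted and the write buffer drains in fifo order.
    while write_buffer:
        bus.append(write_buffer.popleft())
    return bus
```

calling bus_write_order(False) models the write-back case of figure 14, while bus_write_order(True) models a region forced to write-through by either of the two options above.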
4.8.7.2 boff write-back arbitration implementation

boff is used to snoop the on-chip cache in systems where more than one cacheable bus master resides on the microprocessor bus. the boff signal forces the microprocessor to relinquish the bus in the following clock cycle, regardless of the type of bus cycle it was performing at the time. consequently, the use of boff as a bus arbitrator should be implemented with care to avoid system problems.

4.8.8 boff design considerations

boff takes effect immediately when used as a bus arbitration control mechanism: it forces the microprocessor to abort an access in the clock cycle after it is asserted. the following design issues must be considered.

4.8.8.1 cache line fills

the microprocessor aborts a cache line fill during a burst read if boff is asserted during the access. upon regaining the bus, the read access commences where it left off when boff was recognized. external buffers should take this cycle continuation into consideration if boff is allowed to abort burst read cycles.

4.8.8.2 cache line copy-backs

similar to the burst read, the burst write also can be aborted at any time with the boff signal. upon regaining access to the bus, the write continues from where it was aborted. external buffers and control logic should take into consideration the necessary control, if any, for burst write continuations.

4.8.8.3 locked accesses

locked bus cycles occur in various forms. locked accesses occur during read-modify-write operations, interrupt acknowledges, and page table updates. although asserting boff during a locked cycle is permitted, extreme care should be taken to ensure data coherency for semaphore updates and proper data ordering.

4.8.9 boff during write-back

if boff is asserted during a write-back, the processor performing the write-back goes off the bus in the next clock cycle.
if boff is released, the processor restarts that write-back access from the point at which it was aborted. the behavior is identical to the normal boff case that includes the abort and restart behavior.

4.8.10 snooping characteristics during a cache line fill

the following cases apply if snooping is invoked via ahold, and neither hold nor boff is asserted. it also requires that the processor buses are not tied together, such as in a second-level cache system.

the microprocessor takes responsibility for responding to snoop cycles for a cache line only during the time that the line is actually in the cache or in a copy-back buffer. there are times during the cache line fill cycle and during the cache replacement cycle when the line is in transit and snooping responsibility must be taken by other system components.

system designers should consider the possibility that a snooping cycle may arrive at the same time as a cache line fill or replacement for the same address. if a snooping cycle arrives at the same time as a cache line fill with the same address, the cpu uses the cache line fill, but does not place it in the cache.

if a snooping cycle occurs at the same time as a cache line fill with a different address, the cache line fill is placed into the cache unless eads is recognized before the first brdy but after ads is asserted, or eads is recognized on the last brdy of the cache line fill. in these cases, the line is not placed into the cache.

4.8.11 snooping characteristics during a copy-back

if a copy-back is occurring because of a cache line replacement, the address being replaced can be matched by a snoop until assertion of the last brdy of the copy-back. this is when the modified line resides in the copy-back buffer. an eads as late as two clocks before the last brdy can cause hitm to be asserted.

figure 15 illustrates the microprocessor relinquishing responsibility of recognizing snoops for a line that is copied back.
it shows the latest eads assertion that can cause hitm assertion. hitm remains active for only one clock period in that example. hitm remains active through the last brdy of the corresponding write-back; in that case the write-back has already completed. this is the latest point where snooping can start, because two clock cycles later the final brdy of the write-back is applied.

if a snoop cycle hits the copy-back address after the first brdy of the copy-back and ads has been issued, the microprocessor asserts hitm. keep in mind that the write-back was initiated due to a read miss and not due to a snoop to a modified line. in the second case, no snooping is recognized if a modified line is detected.
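the line fill rules of section 4.8.10 reduce to a small decision function. the sketch below is illustrative only: the enum EadsTiming and the function fill_is_cached are hypothetical names, and the model covers just the three timing cases named in the text (ahold snooping, with neither hold nor boff asserted).

```python
from enum import Enum


class EadsTiming(Enum):
    """When eads is recognized relative to the cache line fill."""
    AFTER_ADS_BEFORE_FIRST_BRDY = 1
    ON_LAST_BRDY = 2
    OTHER = 3


def fill_is_cached(same_address: bool, timing: EadsTiming) -> bool:
    """Return True if the line being filled ends up in the on-chip cache."""
    if same_address:
        # the cpu uses the fill data but does not place the line in the cache
        return False
    # a snoop to a different address blocks caching only in the two windows
    # called out in section 4.8.10
    return timing is EadsTiming.OTHER
```

in either case the cpu still uses the data returned by the fill; the function only answers whether the line is kept in the cache afterwards.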
4.9 cache invalidation and flushing in write-back mode

the enhanced am486 microprocessor family supports cache invalidation and flushing, much like the am486dx and am486 microprocessor write-through mode. however, the addition of the write-back cache adds some complexity.

4.9.1 cache invalidation through software

the enhanced am486 microprocessor family uses the same instructions as the am486dx and am486 microprocessor families to invalidate the on-chip cache. the two invalidation instructions, invd and wbinvd, while similar, are slightly different for use in the write-back environment. the wbinvd instruction first performs a write-back of the modified data in the cache to external memory. then it invalidates the cache, followed by two special bus cycles, whereas the invd instruction only invalidates the cache, regardless of whether modified data exists, and follows with a special bus cycle. the utmost care should be taken when executing the invd instruction to ensure memory coherency. otherwise, modified data may be invalidated prior to writing back to main memory.

in write-back mode, wbinvd requires a minimum of 2050 internal clocks to search the cache for modified data. writing back modified data adds to this minimum time. wbinvd can only be stopped by a reset.

two special bus cycles follow the write-back of modified data upon execution of the wbinvd instruction: first the write-back, and then the flush special bus cycle. the invd operates identically to the standard 486 microprocessor family in that the flush special bus cycle is generated when the on-chip cache is invalidated. table 7 specifies the special bus cycle states for the instructions wbinvd and invd.

4.9.2 cache invalidation through hardware

the other mechanism for cache invalidation is the flush pin. the flush pin operates similarly to the wbinvd command, writing back modified cache lines to main memory.
after the entire cache has copied back all the modified data, the microprocessor generates two special bus cycles. these special bus cycles signal to the external caches that the microprocessor on-chip cache has completed its copy-back and that the second level cache may begin its copy-back to memory, if so required. two flush acknowledge cycles are generated after the flush pin is asserted and the modified data in the cache is written back.

as with the wbinvd instruction, in write-back mode, a flush requires a minimum of 2050 internal clocks to test the cache for modified data. writing back modified data adds to this minimum time. the flush operation can only be stopped by a reset. table 8 shows the special flush bus cycle configuration.

table 7. wbinvd/invd special bus cycles

a31-a2       m/io  d/c  w/r  be3  be2  be1  be0  bus cycle
0000 0000h   0     0    1    0    1    1    1    write-back (note 1)
0000 0000h   0     0    1    1    1    0    1    flush (notes 1, 2)

notes:
1. wbinvd generates first write-back, then flush.
2. invd generates only flush.

[figure 15. latest snooping of copy-back. signals shown: clk, adr, ahold, eads, hitm, ads, blast, brdy, cache.]
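the encodings in table 7 can be checked with a short decoder. this is a sketch for illustration, not vendor code: the names SPECIAL_CYCLES, decode_special_cycle, and cycles_after are hypothetical, the a31_a2 argument is the value exactly as printed in the table's address column, and be packs be3 to be0 into one 4-bit number with be3 as the most significant bit.

```python
# byte-enable patterns (be3 be2 be1 be0) from table 7
SPECIAL_CYCLES = {
    0b0111: "write-back",
    0b1101: "flush",
}


def decode_special_cycle(a31_a2: int, m_io: int, d_c: int, w_r: int, be: int):
    """Return the table 7 bus cycle name, or None if the pattern is not listed."""
    if a31_a2 != 0 or (m_io, d_c, w_r) != (0, 0, 1):
        return None
    return SPECIAL_CYCLES.get(be)


def cycles_after(instruction: str) -> list:
    """Special bus cycles generated by each invalidation instruction (table 7 notes)."""
    if instruction == "wbinvd":
        return ["write-back", "flush"]   # note 1: write-back first, then flush
    if instruction == "invd":
        return ["flush"]                 # note 2: flush only
    raise ValueError("not a cache invalidation instruction: " + instruction)
```

external logic that watches for the flush cycle alone cannot tell wbinvd from invd; the preceding write-back special cycle is what distinguishes them.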
4.9.3 snooping during cache flushing

as with snooping during normal operation, snooping is permitted during a cache flush, whether initiated by the flush pin or wbinvd instruction. after completion of the snoop, and write-back, if needed, the microprocessor completes the copy-back of modified cache lines.

table 8. flush special bus cycles

a31-a2       m/io  d/c  w/r  be3  be2  be1  be0  bus cycle
0000 0001h   0     0    1    0    1    1    1    first flush acknowledge
0000 0001h   0     0    1    1    1    0    1    second flush acknowledge

4.10 burst write

the enhanced am486 microprocessor improves system performance by implementing a burst write feature for cache line write-backs and copy-backs. standard write operations are still supported, as they are on the am486dx family of microprocessors. burst writes are always four 32-bit words and start at the beginning of a cache line, at an address offset of 0 for the starting access. the timing of the blast and brdy signals is identical to the burst read. figure 16 shows a burst write access. (see figure 17 and figure 18 for burst read and burst write access with boff asserted.)

in addition to using blast, the cache signal indicates burstable cycles. cache is a cycle definition pin used when in write-back mode (cache floats in write-through mode). for processor-initiated cycles, the signal indicates either

- for a read cycle, the internal cacheability of the cycle
- for a write cycle, a burst write-back or copy-back, if ken is asserted (for linefills).

[figure 16. burst write. signals shown: clk, adr, m/io, w/r, ads, blast, brdy, data, cache; addresses xx0, xx4, xx8, xxc.]

[figure 17. burst read with boff assertion. signals shown: clk, adr, m/io, w/r, ads, blast, brdy, boff, data (to cpu), cache.]
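the address sequence of a burst write, including a restart after boff, can be sketched as follows. this is a simplified model under stated assumptions: burst_write_cycles and its boff_after parameter are hypothetical, boff_after counts the transfers that completed before boff aborted the cycle, and the restart is modelled as a new bus cycle that continues with the first transfer that had not completed.

```python
WORDS_PER_LINE = 4   # burst writes are always four 32-bit words
BYTES_PER_WORD = 4


def burst_write_cycles(line_base: int, boff_after=None) -> list:
    """Return the bus cycles of one burst write as lists of (address, blast)."""
    if line_base % (WORDS_PER_LINE * BYTES_PER_WORD):
        raise ValueError("burst writes start on a cache line boundary")
    if boff_after is not None and not 1 <= boff_after < WORDS_PER_LINE:
        raise ValueError("boff_after must be 1, 2, or 3 completed transfers")

    addresses = [line_base + BYTES_PER_WORD * i for i in range(WORDS_PER_LINE)]
    last = addresses[-1]
    # blast stays 1 while the burst is unfinished and goes to 0 on the last word
    transfers = [(addr, 0 if addr == last else 1) for addr in addresses]

    if boff_after is None:
        return [transfers]
    # boff aborts the cycle; the write later continues where it was aborted
    return [transfers[:boff_after], transfers[boff_after:]]
```

for a line at 0x100 the uninterrupted burst visits 0x100, 0x104, 0x108, and 0x10c, which corresponds to the xx0, xx4, xx8, xxc sequence in the figures.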
[figure 18. burst write with boff assertion. signals shown: clk, adr, m/io, w/r, ads, blast, brdy, boff, data (from cpu), cache.]

cache is asserted for cacheable reads, cacheable code fetches, and write-backs/copy-backs. cache is deasserted for non-cacheable reads, translation lookaside buffer (tlb) replacements, locked cycles (except for write-back cycles generated by an external snoop operation that interrupts a locked read/modify/write sequence), i/o cycles, special cycles, and write-throughs. cache is driven to its valid level in the same clock as the assertion of ads and remains valid until the next rdy or brdy assertion. the cache output pin floats one clock after boff is asserted. additionally, the signal floats when hlda is asserted.

the following steps describe the burst write sequence:

1) the access is started by asserting: ads = 0, m/io = 1, w/r = 1, cache = 0. the address offset is always 0, so the burst write always starts on a cache line boundary. cache transitions high (inactive) after the first brdy.
2) in the second clock cycle, blast is 1 to indicate that the burst is not finished.
3) the burst write access is finished when blast is 0 and brdy is 0.

when the rdy signal is returned instead of the brdy signal, the enhanced am486 microprocessor halts the burst cycle and proceeds with the standard non-burst cycle.

4.10.1 locked accesses

locked accesses of an enhanced am486 microprocessor occur for read-modify-write operations and interrupt acknowledge cycles. the timing is identical to the dx microprocessor, although the state transitions differ from the standard dx microprocessor. unlike processor-initiated accesses, state transitions for locked accesses are seen by all processors in the system. any locked read or write generates an external bus cycle, regardless of cache hit or miss.
during locked cycles, the processor does not recognize a hold request, but it does recognize boff and ahold requests.

locked read operations always read data from the external memory, regardless of whether the data is in the cache. in the event that the data is in the cache and unmodified, the cache line is invalidated and an external read operation is performed. the data from the external memory is used instead of the data in the cache, thus ensuring that the locked read is seen by all other bus masters.

if a locked read occurs, the data is in the cache, and it is modified, the microprocessor first copies back the data to external memory, invalidates the cache line, and then performs a read operation to the same location, thus ensuring that the locked read is seen by all other bus masters. at no time is the data in the cache used directly by the microprocessor for a locked read operation before reading the data from external memory.

since locked cycles always begin with a locked read access, and locked read cycles always invalidate a cache line, a locked write cycle to a valid cache line, either modified or unmodified, does not occur.

4.10.2 serialization

locked accesses are totally serialized:

- all reads and writes in the write buffer that precede the locked access are issued on the bus before the first locked access is executed.
- no read or write after the last locked access is issued internally or on the bus until the final rdy or brdy for all locked accesses.
- it is possible to get a locked read, write-back, locked write cycle.
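the locked read handling of section 4.10.1 can be summarized as a lookup from the cache line state to the operations the processor performs. the function name locked_read_sequence and the three state labels are hypothetical, chosen only for this sketch.

```python
def locked_read_sequence(line_state: str) -> list:
    """Return the operations performed for a locked read, in order."""
    if line_state == "modified":
        # copy back first, so external memory holds the current data
        return ["copy-back", "invalidate line", "external locked read"]
    if line_state == "unmodified":
        return ["invalidate line", "external locked read"]
    if line_state == "not cached":
        return ["external locked read"]
    raise ValueError("unknown cache line state: " + line_state)
```

every path ends with an external read and none leaves the line valid, which is why a locked write to a valid cache line does not occur.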
4.10.3 plock operation in write-through mode

as described in section 3, plock is only used in write-through mode; the signal is driven inactive in write-back mode. in write-through mode, the processor drives plock low to indicate that the current bus transaction requires more than one bus cycle. the cpu continues to drive the signal low until the transaction is completed, whether or not rdy or brdy is returned. refer to the pin description for additional information.

5 clock control

5.1 clock generation

the enhanced am486 cpu is driven by a 1x clock that relies on a phase-locked loop (pll) to generate the two internal clock phases: phase one and phase two. the rising edge of clk corresponds to the start of phase one (ph1). all external timing parameters are specified relative to the rising edge of clk.

5.2 stop clock

the enhanced am486 cpu also provides an interrupt mechanism, stpclk, that allows system hardware to control the power consumption of the cpu by stopping the internal clock to the cpu core in a sequenced manner. the first low-power state is called the stop grant state. if the clk input is completely stopped, the cpu enters into the stop clock state (the lowest power state). when the cpu recognizes a stpclk interrupt, the processor:

- stops execution on the next instruction boundary (unless superseded by a higher priority interrupt).
- waits for completion of cache flush.
- stops the pre-fetch unit.
- empties all internal pipelines and write buffers.
- generates a stop grant bus cycle.
- stops the internal clock.

at this point the cpu is in the stop grant state.

the cpu cannot respond to a stpclk request from an hlda state because it cannot empty the write buffers and, therefore, cannot generate a stop grant cycle. the rising edge of stpclk signals the cpu to return to program execution at the instruction following the interrupted instruction.
unlike the normal interrupts (intr and nmi), stpclk does not initiate interrupt acknowledge cycles or interrupt table reads.

5.2.1 external interrupts in order of priority

in write-through mode, the priority order of external interrupts is:

1) reset/sreset
2) flush
3) smi
4) nmi
5) intr
6) stpclk

in write-back mode, the priority order of external interrupts is:

1) reset
2) flush
3) sreset
4) smi
5) nmi
6) intr
7) stpclk

stpclk is active low and has an internal pull-up resistor. stpclk is asynchronous, but setup and hold times must be met to ensure recognition in any specific clock. stpclk must remain active until the stop grant special bus cycle is asserted and the system responds with either rdy or brdy. when the cpu enters the stop grant state, the internal pull-up resistor is disabled, reducing the cpu power consumption. the stpclk input must be driven high (not floated) to exit the stop grant state. stpclk must be deasserted for a minimum of five clocks after rdy or brdy is returned active for the stop grant bus cycle before being asserted again.

there are two regions for the low-power mode supply current:

1) low power: stop grant state (fast wake-up, frequency- and voltage-dependent),
2) lowest power: stop clock state (slow wake-up, voltage-dependent).

5.3 stop grant bus cycle

the processor drives a special stop grant bus cycle to the bus after recognizing the stpclk interrupt. this bus cycle is the same as the halt cycle used by a standard am486 microprocessor, with the exception that the stop grant bus cycle drives the value 0000 0010h on the address pins.

- m/io = 0
- d/c = 0
- w/r = 1
- address bus = 0000 0010h (a4 = 1)
- be3-be0 = 1011
- data bus = undefined

the system hardware must acknowledge this cycle by returning rdy or brdy, or the processor will not enter the stop grant state (see figure 19). the latency between a stpclk request and the stop grant bus cycle depends on the current instruction, the amount of data in the cpu write buffers, and the system memory performance.
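the two priority lists in section 5.2.1 can be expressed as rank tables with a small resolver. this is only an illustrative sketch: the names PRIORITY and highest_priority are hypothetical, a lower rank means higher priority, and the tie between reset and sreset in write-through mode is broken alphabetically purely to keep the result deterministic.

```python
# rank tables copied from section 5.2.1 (1 = highest priority)
PRIORITY = {
    "write-through": {"reset": 1, "sreset": 1, "flush": 2, "smi": 3,
                      "nmi": 4, "intr": 5, "stpclk": 6},
    "write-back": {"reset": 1, "flush": 2, "sreset": 3, "smi": 4,
                   "nmi": 5, "intr": 6, "stpclk": 7},
}


def highest_priority(pending, mode: str) -> str:
    """Return the pending external interrupt that is serviced first."""
    ranks = PRIORITY[mode]
    # sort key: rank first, then name, so equal ranks resolve predictably
    return min(pending, key=lambda name: (ranks[name], name))
```

the tables make the one difference between the modes easy to see: in write-back mode flush outranks sreset, whereas in write-through mode sreset shares the top rank with reset.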
5.4 pin state during stop grant

table 9 shows the pin states during stop grant bus states. during the stop grant state, most output and input/output signals of the microprocessor maintain the level they held when entering the stop grant state. the data and data parity signals are three-stated. in response to hold being driven active during the stop grant state (when the clk input is running), the cpu generates hlda and three-states all output and input/output signals that are three-stated during the hold/hlda state. after hold is deasserted, all signals return to the same state they were before the hold/hlda sequence.

table 9. pin state during stop grant bus state

signal                   type  state
a3-a2                    o     previous state
a31-a4                   i/o   previous state
d31-d0                   i/o   floated
be3-be0                  o     previous state
dp3-dp0                  i/o   floated
w/r, d/c, m/io, cache    o     previous state
ads                      o     inactive
lock, plock              o     inactive
breq                     o     previous state
hlda                     o     as per hold
blast                    o     previous state
ferr                     o     previous state
pchk                     o     previous state
smiact                   o     previous state
hitm                     o     previous state

to achieve the lowest possible power consumption during the stop grant state, the system designer must ensure the input signals with pull-up resistors are not driven low, and the input signals with pull-down resistors are not driven high.

all inputs except data bus pins must be driven to the power supply rails to ensure the lowest possible current consumption during stop grant or stop clock modes. for compatibility, data pins must be driven low to achieve the lowest possible power consumption.

5.5 clock control state diagram

figure 20 shows the state transitions during a stop clock cycle.

5.5.1 normal state

this is the normal operating state of the cpu. while in the normal state, the clk input can be dynamically changed within the specified clk period stability limits.
5.5.2 stop grant state

the stop grant state provides a low-power state that can be entered by simply asserting the external stpclk interrupt pin. when the stop grant bus cycle has been placed on the bus, and either rdy or brdy is returned, the cpu is in this state. the cpu returns to the normal execution state 10-20 clock periods after stpclk has been deasserted.

while in the stop grant state, the pull-up resistors on stpclk and up are disabled internally. the system must continue to drive these inputs to the state they were in immediately before the cpu entered the stop grant state. for minimum cpu power consumption, all other input pins should be driven to their inactive level while the cpu is in the stop grant state.

[figure 19. entering stop grant state. signals shown: clk, stpclk, addr, rdy; timing parameters t20 and t21; stop grant bus cycle.]
[figure 20. stop clock state machine (valid for write-back mode only)]

[figure 21. recognition of inputs when exiting stop grant state. signals shown: clk, stpclk, nmi, smi; timing parameters t20 and t21; stpclk sampled. note: a = earliest time at which nmi or smi is recognized.]

a reset or sreset brings the cpu from the stop grant state to the normal state. the cpu recognizes the inputs required for cache invalidations (hold, ahold, boff, and eads) as explained later. the cpu does not recognize any other inputs while in the stop grant state. input signals to the cpu are not recognized until 1 clock after stpclk is deasserted (see figure 21).

while in the stop grant state, the cpu does not recognize transitions on the interrupt signals (smi, nmi, and intr). driving an active edge on either smi or nmi does not guarantee recognition and service of the interrupt request following exit from the stop grant state. however, if one of the interrupt signals (smi, nmi, or intr) is driven active while the cpu is in the stop grant state, and held active for at least one clk after stpclk is deasserted, the corresponding interrupt will be serviced. the enhanced am486 cpu product family requires intr to be held active until the cpu issues an interrupt acknowledge cycle to guarantee recognition. this condition also applies to the existing am486 cpus.

in the stop grant state, the system can stop or change the clk input. when the clock stops, the cpu enters the stop clock state. the cpu returns to the stop grant state immediately when the clk input is restarted. you must hold the stpclk input low until a stabilized frequency has been maintained for at least 1 ms to ensure that the pll has had sufficient time to stabilize.

the cpu generates a stop grant bus cycle when entering the state from the normal or the auto halt power down state. when the cpu enters the stop grant state from the stop clock state or the stop clock snoop state, the cpu does not generate a stop grant bus cycle.

5.5.3 stop clock state

stop clock state is entered from the stop grant state by stopping the clk input (either logic high or logic low). none of the cpu input signals should change state while the clk input is stopped. any transition on an input signal (except intr) before the cpu has returned to the stop grant state may result in unpredictable behavior. if intr goes active while the clk input is stopped, and stays active until the cpu issues an interrupt acknowledge bus cycle, it is serviced in the normal manner. system design must ensure the cpu is in the correct state prior to asserting cache invalidation or interrupt signals to the cpu.

5.5.4 auto halt power down state

a halt instruction causes the cpu to enter the auto halt power down state. the cpu issues a normal halt bus cycle, and only transitions to the normal state when intr, nmi, smi, reset, or sreset occurs.

the system can generate a stpclk while the cpu is in the auto halt power down state. the cpu generates a stop grant bus cycle when it enters the stop grant state from the halt state.
when the system deasserts the stpclk interrupt, the cpu returns execution to the halt state. the cpu generates a new halt bus cycle when it re-enters the halt state from the stop grant state.

5.5.5 stop clock snoop state (cache invalidations)

when the cpu is in the stop grant state or the auto halt power down state, the cpu recognizes hold, ahold, boff, and eads for cache invalidation. when the system asserts hold, ahold, or boff, the cpu floats the bus accordingly. when the system asserts eads, the cpu transparently enters stop clock snoop state and powers up for one full clock to perform the required cache snoop cycle. if a modified line is snooped, a cache write-back occurs with hitm transitioning active until the completion of the write-back. it then powers down and returns to the previous state. the cpu does not generate a bus cycle when it returns to the previous state.

5.5.6 cache flush state

when configured in write-back mode, the processor recognizes flush for copying back modified cache lines to memory in the auto halt power down state or normal state. upon the completion of the cache flush, the processor returns to its prior state, and regenerates a special bus cycle, if necessary.

6 sreset function

the enhanced am486 microprocessor family supports a soft reset function through the sreset pin. sreset forces the processor to begin execution in a known state. the processor state after sreset is the same as after reset except that the internal caches, cd and nw in cr0, write buffers, smbase registers, and floating-point registers retain the values they had prior to sreset, and cache snooping is allowed. the processor starts execution at physical address fffffff0h. sreset can be used to help performance for dos extenders written for the 80286 processor. sreset provides a method to switch from protected to real mode while maintaining the internal caches, cr0, and the fpu state. sreset may not be used in place of reset after power-up.
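as an aside on the clock control states of section 5.5, the transitions described there can be sketched as a small state machine. this is a simplified illustration, not a reproduction of figure 20: the class ClockControl and its event names are hypothetical, the cache flush state is left out, and events that the text does not describe for a given state are simply ignored.

```python
WAKE_EVENTS = {"intr", "nmi", "smi", "reset", "sreset"}


class ClockControl:
    """Tracks the clock control state and the bus cycles it generates."""

    def __init__(self):
        self.state = "normal"
        self.grant_origin = None   # state to resume when stpclk is deasserted
        self.snoop_origin = None   # state to resume after a snoop
        self.bus_cycles = []       # special bus cycles driven so far

    def event(self, name: str) -> str:
        s = self.state
        if s == "normal" and name == "halt":
            self.bus_cycles.append("halt")
            self.state = "auto halt power down"
        elif s in ("normal", "auto halt power down") and name == "stpclk_asserted":
            self.grant_origin = s
            self.bus_cycles.append("stop grant")
            self.state = "stop grant"
        elif s == "auto halt power down" and name in WAKE_EVENTS:
            self.state = "normal"
        elif s == "stop grant" and name == "stpclk_deasserted":
            self.state = self.grant_origin
            if self.state == "auto halt power down":
                self.bus_cycles.append("halt")   # new halt cycle on re-entry
        elif s == "stop grant" and name in ("reset", "sreset"):
            self.state = "normal"
        elif s == "stop grant" and name == "clk_stopped":
            self.state = "stop clock"
        elif s == "stop clock" and name == "clk_restarted":
            self.state = "stop grant"            # no stop grant bus cycle here
        elif s in ("stop grant", "auto halt power down") and name == "eads":
            self.snoop_origin = s
            self.state = "stop clock snoop"
        elif s == "stop clock snoop" and name == "snoop_done":
            self.state = self.snoop_origin       # no bus cycle on return
        return self.state
```

feeding the events halt, stpclk_asserted, stpclk_deasserted, intr in that order walks the model from normal through auto halt power down and stop grant and back to normal, recording a halt, a stop grant, and a second halt bus cycle.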
in write-back mode, once sreset is sampled active, the sreset sequence begins on the next instruction boundary (unless flush or reset occur before that boundary). when started, the sreset sequence continues to completion and then normal processor execution resumes, independent of the deassertion of sreset. if a snoop hits a modified line during sreset, a normal write-back cycle occurs. ads is asserted to drive the bus cycles even if sreset is not deasserted.

7 system management mode

7.1 overview

the enhanced am486 microprocessor supports four modes: real, virtual, protected, and system management mode (smm). as an operating mode, smm has a distinct processor environment, interface, and hardware/software features. smm lets the system designer add new software controlled features to the computer products that always operate transparent to the operating system (os) and software applications. smm is intended for use only by system firmware, not by applications software or general purpose systems software.

the smm architectural extension consists of the following elements:

1) system management interrupt (smi) hardware interface
2) dedicated and secure memory space (smram) for smi handler code and cpu state (context) data with a status signal for the system to decode access to that memory space, smiact
3) resume (rsm) instruction, for exiting smm
4) special features, such as i/o restart and i/o instruction information, for transparent power management of i/o peripherals, and auto halt restart

7.2 terminology

the following terms are used throughout the discussion of system management mode.

- smm: system management mode. the operating environment that the processor (system) enters when servicing a system management interrupt.
- smi: system management interrupt. this is the trigger mechanism for the smm interface. when smi is asserted (smi pin asserted low) it causes the processor to invoke smm. the smi pin is the only means of entering smm.
- smi handler: system management mode handler. this is the code that is executed when the processor is in smm. an example application that this code might implement is a power management control or a system control function.
- rsm: resume instruction. this instruction is used by the smi handler to exit the smm and return to the interrupted os or application process.
- smram: this is the physical memory dedicated to smm. the smi handler code and related data reside in this memory. the processor also uses this memory to store its context before executing the smi handler. the operating system and applications should not have access to this memory space.
- smbase: a control register that contains the base address that defines the smram space.
- context: this term refers to the processor state. the smm discussion refers to the context, or processor state, just before the processor invokes smm. the context normally consists of the cpu registers that fully represent the processor state.
- context switch: a context switch is the process of either saving or restoring the context. the smm discussion refers to the context switch as the process of saving/restoring the context while invoking/exiting smm, respectively.
- smsave: a mechanism that saves and restores all internal registers to and from smram.

7.3 system management interrupt processing

the system interrupts the normal program execution and invokes smm by generating a system management interrupt (smi) to the cpu. the cpu services the smi by executing the following sequence (see figure 22).

1) the cpu asserts the smiact signal, instructing the system to enable the smram.
2) the cpu saves its state (internal register) to smram. it starts at the smbase relative address location (see section 7.3.3), and proceeds downward in a stack-like fashion.
3) the cpu switches to the smm processor environment (an external pseudo-real mode).
4) the cpu then jumps to the absolute address of smbase + 8000h in smram to execute the smi handler. this smi handler performs the system management activities.
note: if the smram shares the same physical address location with part of the system ram, it is overlaid smram. to preserve cache consistency and correct smm operation in systems using overlaid smram, the cache must be flushed via the flush pin when entering smm.
5) the smi handler then executes the rsm instruction which restores the cpu's context from smram, deasserts the smiact signal, and then returns control to the previously interrupted program execution.

[figure 22. basic smi interrupt service. sequence shown: instr #1, #2, #3, smi, state save, smi handler, rsm, state restore, instr #4, #5; signals shown: smi, smiact.]
for uses such as fast enabling of external i/o devices, the smsave mode permits the restarting of the i/o instructions and the halt instruction. this is accomplished through the i/o trap restart and halt/auto halt restart slots. only i/o and halt opcodes are restartable. attempts to restart any other opcode may result in unpredictable behavior.
the system management interrupt hardware interface consists of the smi request input and the smiact output used by the system to decode the smram (see figure 23).

[figure 23. basic smi hardware interface (cpu pins smi and smiact)]

7.3.1 system management interrupt processing
smi is a falling-edge triggered, non-maskable interrupt request signal. smi is an asynchronous signal, but setup and hold times must be met to guarantee recognition in a specific clock. the smi input does not have to remain active until the interrupt is actually serviced. the smi input needs to remain active for only a single clock if the required setup and hold times are met. smi also works correctly if it is held active for an arbitrary number of clocks (see figure 24).

[figure 24. smi timing for servicing an i/o trap (signals clk, clk2, smi, rdy; smi sampled with setup time tsu and hold time thd)]

the smi input must be held inactive for at least four clocks after it is asserted to reset the edge-triggered logic. a subsequent smi may not be recognized if the smi input is not held inactive for at least four clocks after being asserted. smi, like nmi, is not affected by the if bit in the eflags register and is recognized on an instruction boundary. smi does not break locked bus cycles. smi has a higher priority than nmi and is not masked during an nmi. after smi is recognized, the smi signal is masked internally until the rsm instruction is executed and the interrupt service routine is complete. masking smi prevents recursive calls.
if another smi occurs while smi is masked, the pending smi is recognized and executed on the next instruction boundary after the current smi completes. this instruction boundary occurs before execution of the next instruction in the interrupted application code, resulting in back-to-back smi handlers. only one smi signal can be pending while smi is masked. the smi signal is synchronized internally and must be asserted at least three clock periods prior to asserting the rdy signal to guarantee recognition on a specific instruction boundary. this is important for servicing an i/o trap with an smi handler.

7.3.2 smi active (smiact)
smiact indicates that the cpu is operating in smm. the cpu asserts smiact in response to an smi interrupt request on the smi pin. smiact is driven active after the cpu has completed all pending write cycles (including emptying the write buffers), and before the first access to smram when the cpu saves (writes) its state (or context) to smram. smiact remains active until the last access to smram when the cpu restores (reads) its state from smram. the smiact signal does not float in response to hold. the smiact signal is used by the system logic to decode smram. the number of clocks required to complete the smm state save and restore is dependent on system memory performance. the values shown in figure 25 assume 0 wait-state memory writes (2 clock cycles), 2-1-1-1 burst read cycles, and 0 wait-state non-burst reads (2 clock cycles). additionally, it is assumed that the data read during the smm state restore sequence is not cacheable.
the minimum time required to enter an smsave smi handler routine for the cpu (from the completion of the interrupted instruction) is given by:
latency to start of smi handler = a + b + c = 161 clocks
and the minimum time required to return to the interrupted application (following the final smm instruction before rsm) is given by:
latency to continue application = e + f + g = 258 clocks
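The two sums above can be checked against the interval values listed with figure 25. The sketch below is only an illustration: the dictionary and function names are hypothetical, and the clock-tripled totals are computed here from the figure 25 column rather than quoted from the text.

```python
# Minimum clock counts transcribed from the figure 25 table.
# "c" and "e" depend on memory performance (see section 7.3.2).
FIGURE_25_CLOCKS = {
    "clock_doubled": {"a": 2, "b": 20, "c": 139, "e": 236, "f": 2, "g": 20},
    "clock_tripled": {"a": 2, "b": 15, "c": 100, "e": 180, "f": 2, "g": 20},
}


def entry_latency(intervals):
    """Latency to start of SMI handler: a + b + c."""
    return intervals["a"] + intervals["b"] + intervals["c"]


def exit_latency(intervals):
    """Latency to continue the application: e + f + g."""
    return intervals["e"] + intervals["f"] + intervals["g"]


for cpu, intervals in FIGURE_25_CLOCKS.items():
    print(cpu, entry_latency(intervals), exit_latency(intervals))
```

For the clock-doubled column this reproduces the 161-clock and 258-clock figures quoted above. The same arithmetic on the clock-tripled column gives 117 and 202 clocks.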
[figure 25. smiact timing (signals clk, clk2, smi, smiact, ads, rdy; phases: normal state, state save, smm handler, state restore, normal state; intervals a-g as listed below)]

interval                                                      clock-doubled cpu   clock-tripled cpu
a: last rdy from non-smm transfer to smiact assertion         2 clks minimum      2 clks minimum
b: smiact assertion to first ads for smm state save           20 clks minimum     15 clks minimum
c: smm state save (dependent on memory performance)           139 clks            100 clks
d: smi handler                                                user-determined     user-determined
e: smm state restore (dependent on memory performance)        236 clks            180 clks
f: last rdy from smm transfer to deassertion of smiact        2 clks minimum      2 clks minimum
g: smiact deassertion to first non-smm ads                    20 clks minimum     20 clks minimum

7.3.3 smram
the cpu uses the smram space for state save and state restore operations during an smi. the smi handler, which also resides in smram, uses the smram space to store code, data, and stacks. in addition, the smi handler can use the smram for system management information such as the system configuration, configuration of a powered-down device, and system designer-specific information.
note: access to smram is through the cpu internal cache. to ensure cache consistency and correct operation, always assert the flush pin in the same clock as smi for systems using overlaid smram.
the cpu asserts smiact to indicate to the memory controller that it is operating in system management mode. the system logic should ensure that only the cpu and smi handler have access to this area. alternate bus masters or dma devices trying to access the smram space when smiact is active should be directed to system ram in the respective area. the system logic is minimally required to decode the physical memory address range from 38000h-3ffffh as smram area. the cpu saves its state to the state save area from 3ffffh downward to 3fe00h.
after saving its state, the cpu jumps to the address location 38000h to begin executing the smi handler. the system logic can choose to decode a larger area of smram as needed. the size of this smram can be between 32 kbytes and 4 gbytes. the system logic should provide a manual method for switching the smram into system memory space when the cpu is not in smm. this enables initialization of the smram space (i.e., loading the smi handler) before executing the smi handler during smm (see figure 26).

[figure 26. redirecting system memory address to smram (system memory accesses redirected to smram; system memory accesses not redirected to smram; cpu accesses to system address space used for loading smram; normal memory space)]
7.3.4 smram state save map
when smi is recognized on an instruction boundary, the cpu core first sets the smiact signal low, indicating to the system logic that accesses are now being made to the system-defined smram areas. the cpu then writes its state to the state save area in the smram. the state save area starts at smbase + [8000h + 7fffh]. the default cs base is 30000h; therefore, the default state save area is at 3ffffh. in this case, the cs base is also referred to as the smbase.

table 10. smram state save map
register offset*   register                  writable?
7ffch              cr0                       no
7ff8h              cr3                       no
7ff4h              eflags                    yes
7ff0h              eip                       yes
7fech              edi                       yes
7fe8h              esi                       yes
7fe4h              ebp                       yes
7fe0h              esp                       yes
7fdch              ebx                       yes
7fd8h              edx                       yes
7fd4h              ecx                       yes
7fd0h              eax                       yes
7fcch              dr6                       no
7fc8h              dr7                       no
7fc4h              tr*                       no
7fc0h              ldtr*                     no
7fbch              gs*                       no
7fb8h              fs*                       no
7fb4h              ds*                       no
7fb0h              ss*                       no
7fach              cs*                       no
7fa8h              es*                       no
7fa7h-7f98h        reserved                  no
7f94h              idt base                  no
7f93h-7f8ch        reserved                  no
7f88h              gdt base                  no
7f87h-7f08h        reserved                  no
7f04h              i/o trap word             no
7f02h              halt auto restart         yes
7f00h              i/o trap restart          yes
7efch              smm revision identifier   yes
7ef8h              state dump base           yes
7ef7h-7e00h        reserved                  no
note: *upper 2 bytes are not modified.

if the smbase relocation feature is enabled, the smram addresses can change. the following formula is used to determine the relocated addresses where the context is saved: smbase + [8000h + register offset], where the default initial smbase is 30000h and the register offset is listed in table 10. reserved spaces are for new registers in future cpus. some registers in the smram state save area may be read and changed by the smi handler, with the changed values restored to the processor register by the rsm instruction. some register images are read-only, and must not be modified. (modifying these registers results in unpredictable behavior.) the values stored in the reserved areas may change in future cpus.
an smi handler should not rely on values stored in a reserved area.
the following registers are written out during smsave mode to the reserved memory locations (7fa7h-7f98h, 7f93h-7f8ch, and 7f87h-7f08h), but are not visible to the system software programmer:
- dr3-dr0
- cr2
- cs, ds, es, fs, gs, and ss hidden descriptor registers
- eip_previous
- gdt attributes and limits
- idt attributes and limits
- ldt attributes, base, and limits
- tss attributes, base, and limits
if an smi request is issued to power down the cpu, the values of all reserved locations in the smm state save must be saved to non-volatile memory.
the following registers are not automatically saved and restored by smi and rsm:
- tr7-tr3
- fpu registers: stn, fcs, fsw, tag word, fp instruction pointer, fp opcode, operand pointer
note: you can save the fpu state by using an fsave or fnsave instruction.
for all smi requests except for power down suspend/resume, these registers do not have to be saved because their contents will not change. during a power down suspend/resume, however, a resume reset clears these registers back to their default values. in this case, the suspend smi handler should read these registers directly to save them and restore them during the power up resume. anytime the smi handler changes these registers in the cpu, it must also save and restore them.
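As a worked instance of the formula smbase + [8000h + register offset] from section 7.3.4, the sketch below maps a few table 10 entries to physical addresses. The dictionary holds only a subset of table 10, and the names `STATE_SAVE_OFFSETS` and `slot_address` are hypothetical helpers introduced for illustration.

```python
DEFAULT_SMBASE = 0x30000  # default initial smbase stated in section 7.3.4

# A subset of table 10: register name -> (register offset, writable?).
STATE_SAVE_OFFSETS = {
    "cr0": (0x7FFC, False),
    "eflags": (0x7FF4, True),
    "eip": (0x7FF0, True),
    "eax": (0x7FD0, True),
    "i/o trap word": (0x7F04, False),
    "halt auto restart": (0x7F02, True),
    "i/o trap restart": (0x7F00, True),
    "smm revision identifier": (0x7EFC, True),
    "state dump base": (0x7EF8, True),
}


def slot_address(register, smbase=DEFAULT_SMBASE):
    """Physical address of a state save slot: smbase + 8000h + offset."""
    offset, _writable = STATE_SAVE_OFFSETS[register]
    return smbase + 0x8000 + offset


def is_writable(register):
    """True if table 10 marks the slot as writable by the smi handler."""
    return STATE_SAVE_OFFSETS[register][1]


print(hex(slot_address("cr0")))
print(hex(slot_address("state dump base")))
```

With the default smbase, the cr0 image sits at 3fffch and the state dump base slot at 3fef8h, both inside the 3ffffh-3fe00h state save area described in section 7.3.3.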
7.4 entering system management mode
smm is one of the major operating modes, along with protected mode, real mode, and virtual mode. figure 27 shows how the processor can enter smm from any of the three modes and then return. the external signal smi causes the processor to switch to smm. the rsm instruction exits smm. smm is transparent to applications programs and operating systems for the following reasons:
- the only way to enter smm is via a type of non-maskable interrupt triggered by an external signal.
- the processor begins executing smm code from a separate address space, referred to earlier as system management ram (smram).
- upon entry into smm, the processor saves the register state of the interrupted program (depending on the save mode) in a part of smram called the smm context save space.
- all interrupts normally handled by the operating system or applications are disabled upon smm entry.
- a special instruction, rsm, restores processor registers from the smm context save space and returns control to the interrupted program.
similar to real mode, smm has no privilege levels or address mapping. smm programs can execute all i/o and other system instructions and can address up to 4 gbytes of memory.

[figure 27. transition to and from smm (state diagram with real mode, protected mode, virtual mode, and system management mode; transitions labeled smi, rsm, reset, pe=1, pe=0, vm=1, vm=0)]

7.5 exiting system management mode
the rsm instruction (opcode 0f aah) leaves smm and returns control to the interrupted program. the rsm instruction can be executed only in smm. an attempt to execute the rsm instruction outside of smm generates an invalid opcode exception. when the rsm instruction is executed and the processor detects invalid state information during the reloading of the save state, the processor enters the shutdown state.
this occurs in the following situations:
- the value in the state dump base field is not a 32-kbyte aligned address.
- a combination of bits in cr0 is illegal: (pg=1 and pe=0) or (nw=1 and cd=0).
in shutdown mode, the processor stops executing instructions until an nmi interrupt is received or reset initialization is invoked. the processor generates a shutdown bus cycle.
four smm features can be enabled by writing to control slots in the smram state save area:
1) auto halt restart. it is possible for the smi request to interrupt the halt state. the smi handler can tell the rsm instruction to return control to the halt instruction or to return control to the instruction following the halt instruction by appropriately setting the auto halt restart slot. the default operation is to restart the halt instruction.
2) i/o trap restart. if the smi was generated on an i/o access to a powered-down device, the smi handler can instruct the rsm instruction to re-execute that i/o instruction by setting the i/o trap restart slot.
3) smbase relocation. the system can relocate the smram by setting the smbase relocation slot in the state save area. the rsm instruction sets smbase in the processor based on the value in the smbase relocation slot. the smbase must be aligned on 32-kbyte boundaries.
a reset also causes execution to exit from smm.

7.6 processor environment
when an smi signal is recognized on an instruction execution boundary, the processor waits for all stores to complete, including emptying the write buffers. the final write cycle is complete when the system returns rdy or brdy. the processor then drives smiact active, saves its register state to smram space, and begins to execute the smi handler.
smi has greater priority than debug exceptions and external interrupts. this means that if more than one of these conditions occur at an instruction boundary, only the smi processing occurs.
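The two shutdown conditions listed in section 7.5 can be expressed as a small predicate that an smi handler author might use to sanity-check a state save image before executing rsm. This is a sketch under assumptions: the function name is hypothetical, and the cr0 bit positions (pe = bit 0, nw = bit 29, cd = bit 30, pg = bit 31) follow the standard x86 cr0 layout rather than anything stated in this section.

```python
# Standard x86 cr0 bit positions (assumed; table 11 confirms pe = 0, pg = 31).
CR0_PE = 1 << 0
CR0_NW = 1 << 29
CR0_CD = 1 << 30
CR0_PG = 1 << 31

SMBASE_ALIGNMENT = 0x8000  # 32 kbytes


def rsm_would_shut_down(cr0_image, state_dump_base):
    """Return True if rsm would find invalid state per section 7.5."""
    misaligned = state_dump_base % SMBASE_ALIGNMENT != 0
    paging_without_protection = bool(cr0_image & CR0_PG) and not (cr0_image & CR0_PE)
    nw_without_cd = bool(cr0_image & CR0_NW) and not (cr0_image & CR0_CD)
    return misaligned or paging_without_protection or nw_without_cd


print(rsm_would_shut_down(CR0_PE | CR0_PG, 0x30000))
print(rsm_would_shut_down(CR0_PG, 0x30000))
```

The first call models a protected-mode image with paging enabled and the default smbase, which passes. The second has pg=1 with pe=0, one of the illegal combinations.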
subsequent smi requests are not acknowledged while the processor is in smm. the first smi request that occurs while the processor is in smm is latched, and serviced when the processor exits smm with the rsm instruction. only one smi signal is latched by the cpu while it is in smm. when the cpu invokes smm, the cpu core registers are initialized as indicated in table 11.
table 11. smm initial cpu core register settings
register                    smm initial state
general purpose registers   unmodified
eflags                      0000 0002h
cr0                         bits 0, 2, 3, and 31 cleared (pe, em, ts, and pg); rest unmodified
dr6                         unpredictable state
dr7                         0000 0400h
gdtr, ldtr, idtr, tssr      unmodified
eip                         0000 8000h
note: interrupts from intr and nmi are disabled on smm entry.

the following is a summary of the key features in the smm environment:
- real mode style address calculation
- 4-gbyte limit checking
- if flag is cleared
- nmi is disabled
- tf flag in eflags is cleared; single step traps are disabled
- dr7 is cleared; debug traps are disabled
- the rsm instruction no longer generates an invalid opcode error
- default 16-bit opcode, register, and stack use
- all bus arbitration (hold, ahold, boff) inputs, and bus sizing (bs8, bs16) inputs operate normally while the cpu is in smm.

7.7 executing system management mode handler
the processor begins execution of the smi handler at offset 8000h in the cs segment. the cs base is initially 30000h, as shown in table 12. the cs base can be changed using the smm base relocation feature. when the smi handler is invoked, the cpu's pe and pg bits in cr0 are reset to 0. the processor is in an environment similar to real mode, but without the 64-kbyte limit checking. however, the default operand size and the default address size are set to 16 bits. the em bit is cleared so that no exceptions are generated. (if the smm was entered from protected mode, the real mode interrupt and exception support is not available.) the smi handler should not use floating-point unit instructions until the fpu is properly detected (within the smi handler) and the exception support is initialized.
because the segment bases (other than cs) are cleared to 0 and the segment limits are set to 4 gbytes, the address space may be treated as a single flat 4-gbyte linear space that is unsegmented. the cpu is still in real mode, and when a segment selector is loaded with a 16-bit value, that value is then shifted left by 4 bits and loaded into the segment base cache. in smm, the cpu can access or jump anywhere within the 4-gbyte logical address space. the cpu can also indirectly access or perform a near jump anywhere within the 4-gbyte logical address space.

table 12. segment register initial states
segment register   selector   base        attributes          limit (1)
cs (2)             3000h      30000h      16-bit, expand up   4 gbytes
ds                 0000h      00000000h   16-bit, expand up   4 gbytes
es                 0000h      00000000h   16-bit, expand up   4 gbytes
fs                 0000h      00000000h   16-bit, expand up   4 gbytes
gs                 0000h      00000000h   16-bit, expand up   4 gbytes
ss                 0000h      00000000h   16-bit, expand up   4 gbytes
notes:
1. the segment limit check is 4 gbytes instead of the usual 64k.
2. the selector value for cs remains at 3000h even if the smbase is changed.
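The real mode style address calculation described in section 7.7 can be illustrated with a short sketch. The helper names are hypothetical; the shift-by-4 rule and the initial cs base and eip come from the text and tables 11 and 12.

```python
def base_from_selector(selector):
    """Segment base loaded when a 16-bit selector is written in smm."""
    return (selector & 0xFFFF) << 4


def linear_address(base, offset):
    """Linear address within the 4-gbyte space (no 64-kbyte limit in smm)."""
    return (base + offset) & 0xFFFFFFFF


# Initial handler entry: cs base 30000h (table 12) plus eip 8000h (table 11).
entry = linear_address(0x30000, 0x8000)
print(hex(entry))

# After the handler loads ds with b800h, the ds base becomes b8000h.
print(hex(base_from_selector(0xB800)))

# A 32-bit offset reaches beyond 64 kbytes from a zero-based segment.
print(hex(linear_address(0x0, 0x00123456)))
```

Note that the cs base on entry is taken from smbase, not recomputed from the selector: per note 2 of table 12, the cs selector stays at 3000h even after relocation, so `base_from_selector` only models selector loads performed by the handler itself.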
7.7.1 exceptions and interrupts with system management mode
when the cpu enters smm, it disables intr interrupts, debug, and single step traps by clearing the eflags, dr6, and dr7 registers. this prevents a debug application from accidentally breaking into an smi handler. this is necessary because the smi handler operates from a distinct address space (smram) and the debug trap does not represent the normal system memory space.
for an smi handler to use the debug trap feature of the processor to debug smi handler code, it must first ensure that an smm-compliant debug handler is available. the smi handler must also ensure dr3-dr0 is saved to be restored later. the debug registers dr3-dr0 and dr7 must then be initialized with the appropriate values.
for the smi handler to use the single step feature of the processor, it must ensure that an smm-compliant single step handler is available and then set the trap flag in the eflags register. if the system design requires the processor to respond to hardware intr requests while in smm, it must ensure that an smm-compliant interrupt handler is available, and then set the interrupt flag in the eflags register (using the sti instruction). software interrupts are not blocked on entry to smm, and the system software designer must provide an smm-compliant interrupt handler before attempting to execute any software interrupt instructions. note that in smm mode the interrupt vector table has the same properties and location as the real mode vector table.
nmi interrupts are blocked on entry to the smi handler. if an nmi request occurs during the smi handler, it is latched and serviced after the processor exits smm. only one nmi request is latched during the smi handler. if an nmi request is pending when the processor executes the rsm instruction, the nmi is serviced before the next instruction of the interrupted code sequence.
although nmi requests are blocked when the cpu enters smm, they may be enabled through software by executing an iret instruction. if the smi handler requires the use of nmi interrupts, it should invoke a dummy interrupt service routine to execute an iret instruction. when an iret instruction is executed, nmi interrupt requests are serviced in the same real mode manner in which they are handled outside of smm.

7.7.2 smm revision identifier
the 32-bit smm revision identifier specifies the version of smm and the extensions that are available on the processor. the fields of the smm revision identifier and bit definitions are shown in tables 13 and 14. bit 17 or 16 indicates whether the feature is supported (1=supported, 0=not supported). the processor always reads the smm revision identifier at the time of a restore. the i/o trap extension and smm base relocation bits are fixed. the processor writes these bits out at the time it performs a save state.
note: changing the state of the reserved bits may result in unpredictable processor behavior.

table 13. system management mode revision identifier
bits      31-18            17                    16                  15-0
field     reserved         smm base relocation   i/o trap extension  smm revision level
default   00000000000000   1                     1                   0000h

table 14. smm revision identifier bit definitions
bit name              description                                                        default state   state at smm entry   state at smm exit   notes
smm base relocation   1=smm base relocation available, 0=smm base relocation unavailable 1               1 / 0                1 / 0               no change in state / no change in state
i/o trap extension    1=i/o trapping available, 0=i/o trapping unavailable               1               1 / 0                1 / 0               no change in state / no change in state
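The field layout in table 13 can be decoded with a few shifts and masks. The sketch below is illustrative only; `decode_smm_revision` is a hypothetical helper, and the sample value 00030000h is simply the table 13 defaults (bits 17 and 16 set, revision level 0000h) packed into one word.

```python
def decode_smm_revision(identifier):
    """Split the 32-bit smm revision identifier into its table 13 fields."""
    return {
        "smm_base_relocation": bool((identifier >> 17) & 1),
        "io_trap_extension": bool((identifier >> 16) & 1),
        "revision_level": identifier & 0xFFFF,
        "reserved": (identifier >> 18) & 0x3FFF,  # bits 31-18
    }


TABLE_13_DEFAULT = 0x00030000  # defaults from table 13 packed into one word

fields = decode_smm_revision(TABLE_13_DEFAULT)
print(fields)
```

A handler could check `smm_base_relocation` before writing the smm base slot, and `io_trap_extension` before relying on the i/o trap word.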
7.7.3 auto halt restart
the auto halt restart slot at register offset (word location) 7f02h in smram indicates to the smi handler that the smi interrupted the cpu during a halt state; bit 0 of slot 7f02h is set to 1 if the previous instruction was a halt (see figure 28). if the smi did not interrupt the cpu in a halt state, then the smi microcode sets bit 0 of the auto halt restart slot to 0. if the previous instruction was a halt, the smi handler can choose to either set or reset bit 0. if this bit is set to 1, the rsm microcode execution forces the processor to re-enter the halt state. if this bit is set to 0 when the rsm instruction is executed, the processor continues execution with the instruction just after the interrupted halt instruction. if the halt instruction is restarted, the cpu will generate a memory access to fetch the halt instruction (if it is not in the internal cache), and execute a halt bus cycle.
table 15 shows the possible restart configurations. if the interrupted instruction was not a halt instruction (bit 0 is set to 0 in the auto halt restart slot upon smm entry), setting bit 0 to 1 will cause unpredictable behavior when the rsm instruction is executed.

table 15. halt auto restart configuration
value at entry   value at exit   processor action on exit
0                0               return to next instruction in interrupted program
0                1               unpredictable
1                0               returns to instruction after halt
1                1               returns to interrupted halt instruction

[figure 28. auto halt restart register offset (register offset 7f02h; bits 15-1 reserved, bit 0 halt auto restart)]

7.7.4 i/o trap restart
the i/o instruction restart slot (register offset 7f00h in smram) gives the smi handler the option of causing the rsm instruction to automatically re-execute the interrupted i/o instruction (see figure 29).
when the rsm instruction is executed, if the i/o instruction restart slot contains the value 0ffh, then the cpu automatically re-executes the i/o instruction that the smi signal trapped. if the i/o instruction restart slot contains the value 00h when the rsm instruction is executed, then the cpu does not re-execute the i/o instruction. the cpu automatically initializes the i/o instruction restart slot to 00h during smm entry. the i/o instruction restart slot should be written only when the processor has generated an smi on an i/o instruction boundary. processor operation is unpredictable when the i/o instruction restart slot is set when the processor is servicing an smi that originated on a non-i/o instruction boundary.
if the system executes back-to-back smi requests, the second smi handler must not set the i/o instruction restart slot. the second back-to-back smi signal will not have the i/o trap word set.

[figure 29. i/o instruction restart register offset (register offset 7f00h; bits 15-0 i/o instruction restart slot)]

7.7.5 i/o trap word
the i/o trap word contains the address of the i/o access that forced the external chipset to assert smi, whether it was a read or write access, and whether the instruction that caused the access to the i/o address was a valid i/o instruction. table 16 shows the layout.
bits 31-16 contain the i/o address that was being accessed at the time smi became active. bits 15-2 are reserved.
if the instruction that caused the i/o trap to occur was a valid i/o instruction (in, out, ins, outs, rep ins, or rep outs), the valid i/o instruction bit is set. if it was not a valid i/o instruction, the bit is saved as a 0. for rep instructions, the external chipset should return a valid smi within the first access.
bit 0 indicates whether the opcode that was accessing the i/o location was performing either a read (1) or a write (0) operation as indicated by the r/w bit.
if an smi occurs and it does not trap an i/o instruction, the contents of the i/o address and r/w bit are unpredictable and should not be used.

table 16. i/o trap word configuration
bits    31-16         15-2       1                       0
field   i/o address   reserved   valid i/o instruction   r/w
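The table 16 layout and the restart slot values from section 7.7.4 can be combined in a short sketch of how an smi handler might inspect a trapped i/o access. The function names are hypothetical, and gating the restart on the valid i/o instruction bit is an assumption made here for illustration; the text only requires that the slot be written when the smi originated on an i/o instruction boundary.

```python
def decode_io_trap_word(word):
    """Split the 32-bit i/o trap word into its table 16 fields."""
    return {
        "io_address": (word >> 16) & 0xFFFF,          # bits 31-16
        "valid_io_instruction": bool((word >> 1) & 1),  # bit 1
        "is_read": bool(word & 1),                      # bit 0: 1=read, 0=write
    }


def restart_slot_value(word, want_restart):
    """Value to store at offset 7f00h: 0ffh re-executes, 00h does not."""
    trapped = decode_io_trap_word(word)
    if want_restart and trapped["valid_io_instruction"]:
        return 0xFF
    return 0x00


sample = 0x03F80003  # hypothetical trap word: read from port 3f8h, valid bit set
print(decode_io_trap_word(sample))
print(hex(restart_slot_value(sample, want_restart=True)))
```

When the valid bit is clear, the sketch leaves the slot at 00h, which matches the value the cpu initializes on smm entry.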
7.7.6 smm base relocation
the enhanced am486 cpu family provides a new control register, smbase. the smram address space can be modified by changing the smbase register before exiting an smi handler routine. smbase can be changed to any 32k-aligned value. (values that are not 32k-aligned cause the cpu to enter the shutdown state when executing the rsm instruction.) smbase is set to the default value of 30000h on reset. if smbase is changed by an smi handler, all subsequent smi requests initiate a state save at the new smbase.
the smbase slot in the smm state save area indicates and changes the smi jump vector location and smram save area. when bit 17 of the smm revision identifier is set, then this feature exists and the smram base and, consequently, the jump vector are as indicated by the smm base slot (see table 15). during the execution of the rsm instruction, the cpu reads this slot and initializes the cpu to use the new smbase during the next smi. during an smi, the cpu does its context save to the new smram area pointed to by the smbase, stores the current smbase in the smm base slot (offset 7ef8h), and then starts execution of the new jump vector based on the current smbase (see figure 30).

[figure 30. smm base slot offset (register offset 7ef8h; bits 31-0 smm base)]

the smbase must be a 32-kbyte aligned, 32-bit integer that indicates a base address for the smram context save area and the smi jump vector. for example, when the processor first powers up, the minimum smram area is from 38000h-3ffffh. the default smbase is 30000h.
as illustrated in figure 31, the starting address of the jump vector is calculated by:
smbase + 8000h
the starting address for the smram state save area is calculated by:
smbase + [8000h + 7fffh]
when this feature is enabled, the smram register map is addressed according to the above formula.

[figure 31. smram usage (smi handler entry point at smbase + 8000h; start of state save at smbase + 8000h + 7fffh)]

to change the smram base address and smi jump vector location, the smi handler modifies the smbase slot. upon executing an rsm instruction, the processor reads the smbase slot and stores it internally. upon recognition of the next smi request, the processor uses the new smbase slot for the smram dump and smi jump vector. if the modified smbase slot does not contain a 32-kbyte aligned value, the rsm microcode causes the cpu to enter the shutdown state.

7.8 smm system design considerations
7.8.1 smram interface
the hardware designed to control the smram space must follow these guidelines:
1) initialize smram space during system boot up. initialization must occur before the first smi occurs. initialization of smram space must include installation of an smi handler and may include installation of related data structures necessary for particular smm applications. the memory controller interfacing smram should provide a means for the initialization code to open the smram space manually.
2) the memory controller must decode a minimum initial smram address space of 38000h-3ffffh.
3) alternate bus masters (such as dma controllers) must not be able to access smram space. the system should allow only the cpu, either through smi or during initialization, to access smram.
4) to implement a 0-v suspend function, the system must have access to all normal system memory from within an smi handler routine. if the smram overlays normal system memory (see figure 32), there must be a method to access overlaid system memory independently.
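The two smbase formulas from section 7.7.6 determine the minimum address window that the memory controller has to decode as smram for a given smbase. The sketch below works that out; `smram_layout` is a hypothetical helper, and the bottom of the state save area (offset 7e00h) is taken from the last row of table 10.

```python
def smram_layout(smbase):
    """Derive the smi jump vector and state save window for an smbase."""
    if smbase % 0x8000 != 0:
        # rsm would put the cpu into the shutdown state for such a value.
        raise ValueError("smbase must be 32-kbyte aligned")
    return {
        "jump_vector": smbase + 0x8000,
        "state_save_top": smbase + 0x8000 + 0x7FFF,
        "state_save_bottom": smbase + 0x8000 + 0x7E00,
    }


default = smram_layout(0x30000)
print({name: hex(addr) for name, addr in default.items()})

relocated = smram_layout(0xA0000)  # hypothetical relocation target
print({name: hex(addr) for name, addr in relocated.items()})
```

For the default smbase this yields the 38000h-3ffffh window named in guideline 2, with the state save occupying 3fe00h-3ffffh. A misaligned value such as 31000h is rejected up front instead of being discovered at rsm time.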
the recommended configuration is to use a separate (non-overlaid) physical address for smram. this non-overlaid scheme prevents the cpu from improperly accessing the smram or system ram directly or through the cache. figure 33 shows the relative smm timing for non-overlaid smram for systems configured in write-through mode. for systems configured in write-back mode, wb/wt must be driven low (as shown in figure 34) to force caching during smm to be write-through. alternately, caching can be disabled during smm by deasserting ken with smi (as shown in figure 35).
when the default smram location is used, however, smram is overlaid with system main memory (at 38000h-3ffffh). for simplicity, system designers may want to use this default address, or they may select another overlaid address range. however, in this case the system control circuitry must use smiact to distinguish between smram and main system memory, and must restrict smram space access to the cpu only. to maintain cache coherency and to ensure proper system operation in systems configured in write-through mode, the system must flush both the cpu internal cache and any second level caches in response to smiact going low. a system that uses cache during smm must flush the cache a second time in response to smiact going high (see figure 36). if ken is driven high when flush is asserted, the cache is disabled and a second flush is not required (see figure 37). if the system is configured in write-back mode, the cache must be flushed when smi is asserted and then disabled (see figure 38).

7.8.2 cache flushes
the cpu does not unconditionally flush its cache before entering smm. therefore, the designer must ensure that, for systems using overlaid smram, the cache is flushed upon smm entry, and smm exit if caching is enabled.
note: a cache flush in a system configured in write-back mode requires a minimum of 2050 internal clocks to test the cache for modified data, whether invoked by the flush pin input or the wbinvd instruction, and therefore incurs a performance penalty. there is no flush penalty for systems configured in write-through mode.

[figure 32. smram location (non-overlaid: smram separate from normal memory, no need to flush caches; overlaid: smram shares an overlaid region with normal memory, caches must be flushed)]

if the flush at smm entry is not done, the first smm read could hit in a cache that contains normal memory space code/data instead of the required smi handler, and the handler could not be executed. if the cache is not disabled and the cache is not flushed at smm exit, the normal read cycles after smm may hit in a cache that may contain smm code/data instead of the normal system memory contents.
in write-through mode, assert the flush signal in response to the assertion of smiact at smm entry, and, if required because the cache is enabled, assert flush again in response to the deassertion of smiact at smm exit (see figures 36 and 37). for systems configured in write-back mode, assert flush with smi (see figure 38).
reloading the state registers at the end of smm restores cache functionality to its pre-smm state.

7.8.3 a20m pin
systems based on the ms-dos operating system contain a feature that enables the cpu address bit a20 to be forced to 0. this limits physical memory to a maximum of 1 mbyte, and is provided to ensure compatibility with those programs that relied on the physical address wrap around functionality of the original ibm pc. the a20m pin on enhanced am486 cpus provides this function. when a20m is active, all external bus cycles drive a20 low, and all internal cache accesses are performed with a20 low.
the a20m pin is recognized while the cpu is in smm.
the functionality of the a20m input must be recognized in two instances:
1) if the smi handler needs to access system memory space above 1 mbyte (for example, when saving memory to disk for a zero-volt suspend), the a20m pin must be deasserted before the memory above 1 mbyte is addressed.
2) if smram has been relocated to address space above 1 mbyte, and a20m is active upon entering smm, the cpu attempts to access smram at the relocated address, but with a20 low. this could cause the system to crash, because there would be no valid smm interrupt handler at the accessed location.
to account for these two situations, the system designer must ensure that a20m is deasserted on entry to smm. a20m must be driven inactive before the first cycle of the smm state save, and must be returned to its original level after the last cycle of the smm state restore. this can be done by blocking the assertion of a20m when smiact is active.
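The second instance above can be made concrete with a small model of address bit 20 masking. This is a sketch, not a description of the bus logic: `driven_address` is a hypothetical helper, and the relocated smbase of 100000h is an illustrative value chosen to sit exactly at the 1-mbyte boundary.

```python
A20_BIT = 1 << 20


def driven_address(address, a20m_active):
    """Address seen on the bus: a20 is forced low while a20m is active."""
    if a20m_active:
        return address & ~A20_BIT & 0xFFFFFFFF
    return address


relocated_smbase = 0x100000          # hypothetical smbase at 1 mbyte
jump_vector = relocated_smbase + 0x8000

# With a20m blocked during smm, the handler is fetched where it was installed.
print(hex(driven_address(jump_vector, a20m_active=False)))

# With a20m still active, the fetch wraps into the first megabyte instead.
print(hex(driven_address(jump_vector, a20m_active=True)))
```

In this example the handler installed at 108000h would be fetched from 8000h if a20m stayed active, which is the crash scenario the text warns about. Addresses below 1 mbyte, such as the default 38000h entry point, are unaffected by the mask.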
figure 33. smm timing in systems using non-overlaid memory space and write-through mode with caching enabled during smm (signals: smi, smiact; phases: state save, smi handler, rsm, state resume, normal cycle)

figure 34. smm timing in systems using non-overlaid memory spaces and write-back mode with caching enabled during smm (signals: smi, smiact, wb/wt; phases: state save, smi handler, rsm, state resume, normal cycle)

note: for proper operation of systems configured in write-back mode when caching during smm is allowed, force wb/wt low to force all caching to be write-through during smm.

figure 35. smm timing in systems using non-overlaid memory spaces and write-back mode with caching disabled during smm (signals: smi, smiact, ken; phases: state save, smi handler, rsm, state resume, normal cycle)
figure 36. smm timing in systems using overlaid memory space and write-through mode with caching enabled during smm (signals: smi, smiact, flush; phases: state save, smi handler, rsm, state resume, normal cycle; annotations: instruction x, instruction x+1, cache contents invalidated at entry and at exit)

figure 37. smm timing in systems using overlaid memory spaces and write-through mode with caching disabled during smm (signals: smi, smiact, flush, ken; phases: state save, smi handler, rsm, state resume, normal cycle; annotations: instruction x, instruction x+1, cache contents invalidated)

figure 38. smm timing in systems using overlaid memory spaces and configured in write-back mode (signals: smi, smiact, ken, flush; phases: cache flush state, state save, smi handler, rsm, state resume, normal cycle; annotation: cache must be empty)
7.8.4 cpu reset during smm

the system designer should take into account the following restrictions while implementing the cpu reset logic:

1) when running software written for the 80286 cpu, a cpu reset switches the cpu from protected mode to real mode. reset and sreset have a higher priority than smi. while the cpu is in smm, sreset to the cpu should be blocked until the cpu exits smm. sreset must be blocked beginning from the time when smi is driven active. care should be taken not to block the global system reset, which may be necessary to recover from a system crash.

2) during execution of the rsm instruction to exit smm, there is a small time window between the deassertion of smiact and the completion of the rsm microcode. if a protected mode to real mode sreset is asserted during this window, it is possible that the smram space will be violated. the system designer must guarantee that sreset is blocked until at least 20 cpu clock cycles after smiact has been driven inactive or until the start of a bus cycle.

3) any request for a cpu reset for the purpose of switching the cpu from protected mode to real mode must be acknowledged after the cpu has exited smm. to maintain software transparency, the system logic must latch any sreset signals that are blocked during smm.

for these reasons, the sreset signal should be used for any soft resets, and the reset signal should be used for all hard resets.

7.8.5 smm and second level write buffers

before the processor enters smm, it empties its internal write buffers. this ensures that the data in the write buffers is written to normal memory space, not smm space. when the cpu is ready to begin writing an smm state save to smram, it asserts smiact. smiact may be driven active by the cpu before the system memory controller has had an opportunity to empty the second level write buffers.
to prevent the data from these second level write buffers from being written to the wrong location, the system memory controller needs to direct the memory write cycles to either smm space or normal memory space. this can be accomplished by saving the status of smiact with the address for each word in the write buffers.

7.8.6 nested smi and i/o restart

special care must be taken when executing an smi handler for the purpose of restarting an i/o instruction. when the cpu executes a resume (rsm) instruction with the i/o restart slot set, the restored eip is modified to point to the instruction immediately preceding the smi request, so that the i/o instruction can be re-executed. if a new smi request is received while the cpu is executing an smi handler, the cpu services this smi request before restarting the original i/o instruction. if the i/o restart slot is set when the cpu executes the rsm instruction for the second smi handler, the rsm microcode decrements the restored eip again. eip then points to an address different from the originally interrupted instruction, and the cpu begins execution at an incorrect entry point. to prevent this from occurring, the smi handler routine must not set the i/o restart slot during the second of two consecutive smi handlers.

7.9 smm software considerations

7.9.1 smm code considerations

the default operand size and the default address size are 16 bits; however, operand-size override and address-size override prefixes can be used as needed to directly access data anywhere within the 4-gbyte logical address space. with operand-size override prefixes, the smi handler can use jumps, calls, and returns to transfer control to any location within the 4-gbyte space. note, however, the following restrictions:

1) any control transfer that does not have an operand-size override prefix truncates eip to 16 low-order bits.
2) due to the real mode style of base-address formation, a long jump or call cannot transfer control to a segment with a base address of more than 20 bits (1 mbyte).

7.9.2 exception handling

upon entry into smm, external interrupts that require handlers are disabled (the if in eflags is cleared). this is necessary because, while the processor is in smm, it is running in a separate memory space. consequently, the vectors stored in the interrupt descriptor table (idt) for the prior mode are not applicable. before allowing exception handling (or software interrupts), the smm program must initialize new interrupt and exception vectors. the interrupt vector table for smm has the same format as for real mode. until the interrupt vector table is correctly initialized, the smi handler must not generate an exception (or software interrupt). even though hardware interrupts are disabled, exceptions and software interrupts can still occur. only a correctly written smi handler can prevent internal exceptions. when new exception vectors are initialized, internal exceptions can be serviced. the restrictions follow:

1) due to the real mode style of base address formation, an interrupt or exception cannot transfer control to a segment with a base address of more than 20 bits.

2) an interrupt or exception cannot transfer control to a segment offset of more than 16 bits.

3) if exceptions or interrupts are allowed to occur, only the low order 16 bits of the return address are pushed
onto the stack. if the offset of the interrupted procedure is greater than 64 kbytes, it is not possible for the interrupt/exception handler to return control to that procedure. (one work-around is to perform software adjustment of the return address on the stack.)

4) the smbase relocation feature affects the way the cpu returns from an interrupt or exception during an smi handler.

note: the execution of an iret instruction enables non-maskable interrupt (nmi) processing.

7.9.3 halt during smm

halt should not be executed during smm, unless interrupts have been enabled. interrupts are disabled on entry to smm. intr and nmi are the only events that take the cpu out of halt within smm.

7.9.4 relocating smram to an address above 1 mbyte

within smm (or real mode), the segment base registers can be updated only by changing the segment register. the segment registers contain only 16 bits, which allows only 20 bits to be used for a segment base address (the segment register is shifted left 4 bits to determine the segment base address). if smram is relocated to an address above 1 mbyte, the segment registers can no longer be initialized to point to smram.

these areas can still be accessed by using address override prefixes to generate an offset to the correct address. for example, if the smbase has been relocated immediately below 16m, the ds and es registers are still initialized to 0000 0000h. data in smram can still be accessed by using 32-bit displacement registers:

    mov esi, 00ffxxxxh    ; 64k segment immediately below 16m
    mov ax, ds:[esi]

8 test registers 4 and 5 modifications

the cache test registers for the enhanced am486 microprocessor are the same test registers (tr3, tr4, and tr5) provided in earlier am486dx and dx2 microprocessors. tr3 is the cache test data register. tr4, the cache test status register, and tr5, the cache test control register, operate together with tr3.
if wb/wt meets the necessary setup timing and is sampled low on the falling edge of reset, the processor is placed in write-through mode and the test register function is identical to the earlier am486 microprocessors. if wb/wt meets the necessary setup timing and is sampled high on the falling edge of reset, the processor is placed in write-back mode and the test registers tr4 and tr5 are modified to support the added write-back cache functionality. tables 17 and 18 show the individual bit functions of these registers. sections 8.1 and 8.2 provide a detailed description of the field functions.

note: tr3 has the same functions in both write-through and write-back modes. these functions are identical to the tr3 register functions provided by earlier am486 microprocessors.

table 17. test register (tr4)

ext = 0:
  bits 31–11   tag
  bit 10       valid
  bits 9–7     lru
  bits 6–3     valid (rd)
  bits 2–0     not used

ext = 1:
  bit 31       not used
  bits 30–29   stn
  bit 28       rsvd.
  bits 27–26   st3
  bits 25–24   st2
  bits 23–22   st1
  bits 21–20   st0
  bits 19–16   reserved
  bits 15–11   not used
  bit 10       valid
  bits 9–7     lru
  bits 6–3     valid (rd)
  bits 2–0     not used

note: if ext = 0, tr4 has the standard 486 processor definition for write-through cache.

table 18. test register (tr5)

write-back:
  bits 31–20   not used
  bit 19       ext
  bits 18–17   set state
  bit 16       reserved
  bits 15–11   not used
  bits 10–4    index
  bits 3–2     entry
  bits 1–0     control

write-through:
  bits 31–11   not used
  bits 10–4    index
  bits 3–2     entry
  bits 1–0     control

note: the values of stn, st3–st0, and set state are: 00 = invalid; 01 = exclusive; 10 = modified; 11 = shared

8.1 tr4 definition

this section includes a detailed description of the bit fields defined for tr4.

note: bits listed in table 17 as reserved or not used are not included in these descriptions.

- tag (bits 31–11): read/write, always available in write-through mode. available only when ext=0 in tr5 in write-back mode. for a cache write, this is the tag that specifies the address in memory. on a cache look-up, this is the tag for the selected entry in the cache.
- stn (bits 30–29): read only, available only in write-back mode when ext=1 in tr5. stn returns the status of the set (st3, st2, st1, or st0) specified by the tr5 set state field (bits 18–17) during cache look-ups. returned values are: 00 = invalid; 01 = exclusive; 10 = modified; 11 = shared.

- st3 (bits 27–26): read only, available only in write-back mode when ext=1 in tr5. st3 returns the status of set 3 during cache look-ups. returned values are: 00 = invalid; 01 = exclusive; 10 = modified; 11 = shared.

- st2 (bits 25–24): read only, available only in write-back mode when ext=1 in tr5. st2 returns the status of set 2 during cache look-ups. returned values are: 00 = invalid; 01 = exclusive; 10 = modified; 11 = shared.

- st1 (bits 23–22): read only, available only in write-back mode when ext=1 in tr5. st1 returns the status of set 1 during cache look-ups. returned values are: 00 = invalid; 01 = exclusive; 10 = modified; 11 = shared.

- st0 (bits 21–20): read only, available only in write-back mode when ext=1 in tr5. st0 returns the status of set 0 during cache look-ups. returned values are: 00 = invalid; 01 = exclusive; 10 = modified; 11 = shared.

- valid (bit 10): read/write, independent of the ext bit in tr5. this is the valid bit for the accessed entry. on a cache look-up, valid is a copy of one of the bits reported in bits 6–3. on a cache write in write-through mode, valid becomes the new valid bit for the selected entry and set. in write-back mode, writing to the valid bit has no effect and is ignored; the set state bit locations in tr5 are used to set the valid bit for the selected entry and set.

- lru (bits 9–7): read only, independent of the ext bit in tr5. on a cache look-up, these are the three lru bits of the accessed set. on a cache write, these bits are ignored; the lru bits in the cache are updated by the pseudo-lru cache replacement algorithm.
  write operations to these locations have no effect on the device.

- valid (bits 6–3): read only, independent of the ext bit in tr5. on a cache look-up, these are the four valid bits of the accessed set. in write-back mode, these valid bits are set if a cache set is in the exclusive, modified, or shared state. write operations to these locations have no effect on the device.

8.2 tr5 definition

this section includes a detailed description of the bit fields in tr5.

note: bits listed in table 18 as reserved or not used are not included in the descriptions.

- ext (bit 19): read/write, available only in write-back mode. ext, or extension, determines which bit fields are defined for tr4: the address tag field, or the stn and st3–st0 status bit fields. in write-through mode, the ext bit is not accessible. the following describes the two states of ext:
  ext = 0: bits 31–11 of tr4 contain the tag address
  ext = 1: bits 30–29 of tr4 contain stn, bits 27–20 contain st3–st0

- set state (bits 18–17): read/write, available only in write-back mode. the set state field is used to change the mesi state of the set specified by the index and entry bits. the state is set by writing one of the following combinations to this field: 00 = invalid; 01 = exclusive; 10 = modified; 11 = shared.

- index (bits 10–4): read/write, independent of write-through or write-back mode. index selects one of the 128 sets.

- entry (bits 3–2): read/write, independent of write-through or write-back mode. entry selects one of the four entries in the set addressed by the index field during a cache read or write. during cache fill buffer writes or cache read buffer reads, the value in the entry field selects one of the four doublewords in a cache line.
- control (bits 1–0): read/write, independent of write-through or write-back mode. the control bits determine which operation to perform. the following is a definition of the control operations:
  00 = write to cache fill buffer, or read from cache read buffer
  01 = perform cache write
  10 = perform cache read
  11 = flush the cache (mark all entries invalid)

8.3 using tr4 and tr5 for cache testing

the following paragraphs provide examples of testing the cache using tr4 and tr5.

8.3.1 example 1: reading the cache (write-back mode only)

1) disable caching by setting the cd bit in the cr0 register.

2) in tr5, load 0 into the ext field (bit 19), the required index into the index field (bits 10–4), the required entry value into the entry field (bits 3–2), and 10 into the control field (bits 1–0). loading the values into tr5 triggers the cache read. the cache read loads the tr4 register with the tag for the read entry, and the lru and valid bits for the entire set that was read. the cache read loads 128 data bits into the cache read buffer. the entire buffer can be read by placing each of the four binary combinations in the entry field and setting the control field in tr5 to 00 (binary). read each doubleword from the cache read buffer through tr3.

3) reading the set state fields in tr4 during write-back mode is accomplished by setting the ext field in tr5 to 1 and re-reading tr4.

8.3.2 example 2: writing the cache

1) disable the cache by setting the cd bit in the cr0 register.

2) in tr5, load 0 into the ext field (bit 19), the required entry value into the entry field (bits 3–2), and 00 into the control field (bits 1–0).

3) load the tr3 register with the data to write to the cache fill buffer. the cache fill buffer write is triggered by loading tr3.

4) repeat steps 2 and 3 for the remaining three doublewords in the cache fill buffer.
5) in tr4, load the required values into the tag field (bits 31–11) and the valid field (bit 10). in write-back mode, the valid bit is ignored since the set state field in tr5 is used in place of the tr4 valid bit. the other bits in tr4 (9:0) have no effect on the cache write.

6) in tr5, load 0 into the ext field (bit 19), the required value into the set state field (bits 18–17) (write-back mode only), the required index into the index field (bits 10–4), the required entry value into the entry field (bits 3–2), and 01 into the control field (bits 1–0). loading the values into tr5 triggers the cache write. in write-through mode, the set state field is ignored, and the valid bit (bit 10) in tr4 is used instead to define the state of the specified set.

8.3.3 example 3: flushing the cache

the cache flush mechanism functions in the same way in write-back and write-through modes. load 11 into the control field (bits 1–0) of tr5. all other fields are ignored, except for ext in write-back mode. the cache flush is triggered by loading the value into tr5. all of the lru bits, valid bits, and set state bits are cleared.

9 enhanced am486 cpu functional differences

several important differences exist between the enhanced am486 microprocessor and the am486dx microprocessor:

- the id register contains a different version signature.
- the eads function performs cache line write-backs of modified lines to memory in write-back mode.
- a burst write feature is available for copy-backs. the flush pin and wbinvd instruction copy-back all modified data to external memory prior to issuing the special bus cycle or reset.

9.1 status after reset

the reset state is invoked either after power up or after the reset signal is applied according to the standard am486dx microprocessor specification.

9.2 cache status

after reset, the status bits of all lines are set to 0. the lru bits of each set are placed in a starting state.
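returning to the tr5 fields of section 8.2 and the procedures of section 8.3, the following is a minimal sketch of how the control words could be assembled and how the set state fields of tr4 (ext = 1) could be decoded. the helper names are hypothetical. the code only computes values; loading them into tr5 or reading tr4 still has to be done by privileged code on the target and is not shown here.

```python
# control field encodings (bits 1-0 of tr5), as listed in section 8.2
CONTROL_BUFFER = 0b00  # write cache fill buffer / read cache read buffer
CONTROL_WRITE = 0b01   # perform cache write
CONTROL_READ = 0b10    # perform cache read
CONTROL_FLUSH = 0b11   # flush the cache

# two-bit state encodings shared by stn, st3-st0, and set state
STATE_NAMES = {0b00: "invalid", 0b01: "exclusive", 0b10: "modified", 0b11: "shared"}


def encode_tr5(index, entry, control, ext=0, set_state=0):
    """pack tr5 fields: ext bit 19, set state 18-17, index 10-4, entry 3-2, control 1-0."""
    if not 0 <= index < 128:
        raise ValueError("index selects one of 128 sets")
    if not 0 <= entry < 4:
        raise ValueError("entry selects one of four entries")
    if not 0 <= control < 4:
        raise ValueError("control is a two-bit field")
    if ext not in (0, 1):
        raise ValueError("ext is a single bit")
    if not 0 <= set_state < 4:
        raise ValueError("set state is a two-bit field")
    return (ext << 19) | (set_state << 17) | (index << 4) | (entry << 2) | control


def decode_tr4_states(tr4):
    """decode stn and st3-st0 from a tr4 value read with ext = 1."""
    shifts = {"stn": 29, "st3": 26, "st2": 24, "st1": 22, "st0": 20}
    return {name: STATE_NAMES[(tr4 >> shift) & 0b11] for name, shift in shifts.items()}


# example 1, step 2: cache read of entry 2 in set 5
print(hex(encode_tr5(index=5, entry=2, control=CONTROL_READ)))
# example 2, step 6: cache write marking the line as modified
print(hex(encode_tr5(index=127, entry=3, control=CONTROL_WRITE, set_state=0b10)))
```

keeping the field positions in one function makes it harder to mix up the write-back-only fields (ext, set state) with the fields that are valid in both modes.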
10 enhanced am486 cpu identification

the enhanced am486 microprocessor supports two standard methods for identifying the cpu in a system. the reported values are dynamically assigned based on the cpu type (dx2 or dx4) and the status of the wb/wt pin input (low = write-through; high = write-back) at reset.

10.1 dx register at reset

the dx register always contains a component identifier at the conclusion of reset. the upper byte of dx (dh) contains 04 and the lower byte of dx (dl) contains a cpu type/stepping identifier (see table 19).

table 19. cpu id codes

cpu type and cache mode | component id (dh) | revision id (dl)
dx2 in write-through mode | 04 | 3x
dx2 in write-back mode | 04 | 7x
dx4 in write-through mode | 04 | 8x
dx4 in write-back mode | 04 | 9x

10.2 cpuid instruction

the enhanced am486 microprocessor family implements a new instruction that makes information available to software about the family, model and stepping of the microprocessor on which it is executing. support of this instruction is indicated by the presence of a user-modifiable bit in position eflags.21, referred to as the eflags.id bit. this bit is reset to zero at device reset (reset or sreset) for compatibility with existing processor designs.

10.2.1 cpuid timing

cpuid execution timing depends on the selected eax parameter values (see table 20).

table 20. cpuid instruction description

op code | instruction | eax input value | cpu core clocks | description
0f a2 | cpuid | 0 | 41 | amd string
 | | 1 | 14 | cpu id register
 | | >1 | 9 | null registers

10.2.2 cpuid operation

the cpuid instruction requires the user to pass an input parameter to the cpu in the eax register. the cpu response is returned to the user in registers eax, ebx, ecx, and edx.

when the parameter passed in eax is zero, the register values returned upon instruction execution are:

eax[31:0]   00000001h
ebx[31:0]   68747541h
ecx[31:0]   444d4163h
edx[31:0]   69746e65h

the values in ebx, ecx, and edx indicate an amd microprocessor. when taken in the proper order:

- ebx (least significant bit to most significant bit)
- edx (least significant bit to most significant bit)
- ecx (least significant bit to most significant bit)

they decode to: AuthenticAMD

when the parameter passed in eax is 1, the register values returned are:

eax[3:0]     stepping id*
eax[7:4]     model:
               enhanced am486 dx2 cpu: write-through mode = 3h, write-back mode = 7h
               enhanced am486 dx4 cpu: write-through mode = 8h, write-back mode = 9h
eax[11:8]    family: am486 cpu = 4h
eax[15:12]   0000
eax[31:16]   reserved
ebx[31:0]    00000000h
ecx[31:0]    00000000h
edx[31:0]    00000001h = all versions (the 1 in bit 0 indicates that the fpu is present)

note: *please contact amd for stepping id details.

the value returned in eax after cpuid instruction execution is identical to the value loaded into edx upon device reset. software must avoid any dependency upon the state of reserved processor bits.

when the parameter passed in eax is greater than one, register values returned upon instruction execution are:

eax[31:0]   00000000h
ebx[31:0]   00000000h
ecx[31:0]   00000000h
edx[31:0]   00000000h

flags affected: no flags are affected.
exceptions: none
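the decoding described in section 10.2.2 can be checked with a short sketch. it works on register values that have already been captured, since python cannot issue cpuid directly; the function names and the returned dictionary layout are illustrative assumptions, while the constants come from the register listings above.

```python
import struct

# model codes from the eax[7:4] field listed for eax = 1
MODELS = {
    0x3: ("dx2", "write-through"),
    0x7: ("dx2", "write-back"),
    0x8: ("dx4", "write-through"),
    0x9: ("dx4", "write-back"),
}


def vendor_string(ebx, edx, ecx):
    """concatenate ebx, edx, ecx (low byte first) into the 12-character vendor string."""
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")


def decode_signature(eax):
    """split the eax = 1 result into stepping, model, and family fields."""
    model = (eax >> 4) & 0xF
    cpu, cache_mode = MODELS.get(model, ("unknown", "unknown"))
    return {
        "stepping": eax & 0xF,
        "model": model,
        "family": (eax >> 8) & 0xF,
        "cpu": cpu,
        "cache_mode": cache_mode,
    }


print(vendor_string(0x68747541, 0x69746E65, 0x444D4163))
print(decode_signature(0x0494))  # example value: family 4, model 9, stepping 4
```

the register order ebx, edx, ecx matters: packing them as ebx, ecx, edx would produce a scrambled string.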
11 electrical data

the following sections describe recommended electrical connections for the enhanced am486 microprocessors and electrical specifications.

11.1 power and grounding

11.1.1 power connections

enhanced am486 microprocessors have modest power requirements. however, the high clock frequency output buffers can cause power surges as multiple output buffers drive new signal levels simultaneously. for clean, on-chip power distribution at high frequency, 23 vcc pins and 28 vss pins feed the microprocessor in the 168-pin pga package. the 208-pin sqfp package includes 53 vcc pins and 38 vss pins.

power and ground connections must be made to all external vcc and vss pins of the microprocessors. on a circuit board, all vcc pins must connect to a vcc plane. likewise, all vss pins must connect to a common gnd plane.

the enhanced am486 microprocessor family requires only 3.3 v as input power. unlike other 3-v 486 processors, the enhanced am486 microprocessor family does not require a vcc5 input of 5 v to indicate the presence of 5-v i/o devices on the system motherboard. for socket compatibility, this pin is inc, allowing the enhanced am486 cpu to operate in 3-v sockets in systems that use 5-v i/o.

11.1.2 power decoupling recommendations

liberal decoupling capacitance should be placed near the microprocessor. the microprocessor, driving its 32-bit parallel address and data buses at high frequencies, can cause transient power surges, particularly when driving large capacitive loads.

low inductance capacitors and interconnects are recommended for best high-frequency electrical performance. inductance can be reduced by shortening circuit board traces between the microprocessor and the decoupling capacitors. capacitors designed specifically for use with pga packages are commercially available.
11.1.3 other connection recommendations

for reliable operation, always connect unused inputs to an appropriate signal level. active low inputs should be connected to vcc through a pull-up resistor. pull-ups in the range of 20 kΩ are recommended. active high inputs should be connected to gnd.
absolute maximum ratings

case temperature under bias: -65°c to +110°c
storage temperature: -65°c to +150°c
voltage on any pin with respect to ground: -0.5 v to vcc + 2.6 v
supply voltage with respect to vss: -0.5 v to +4.6 v

stresses above those listed under absolute maximum ratings may cause permanent device failure. functionality at or above these limits is not implied. exposure to absolute maximum ratings for extended periods may affect device reliability.

operating ranges

commercial (c) devices
tcase: 0°c to 85°c
vcc: 3.3 v ± 0.3 v

operating ranges define those limits between which the functionality of the device is guaranteed.

dc characteristics over commercial operating ranges
vcc = 3.3 v ± 0.3 v; tcase = 0°c to +85°c

symbol | parameter | min | max | notes
vil | input low voltage | -0.3 v | +0.8 v |
vih | input high voltage | 2.0 v | vcc + 2.4 v |
vol | output low voltage | | 0.45 v | note 1
voh | output high voltage | 2.4 v | | note 2
icc | power supply current, 66 mhz | | 660 ma | typical 528 ma
icc | power supply current, 75 mhz | | 750 ma | typical 600 ma
icc | power supply current, 80 mhz | | 800 ma | typical 640 ma
icc | power supply current, 100 mhz | | 1000 ma | typical 800 ma
icc | power supply current, 120 mhz | | 1200 ma | typical 960 ma
iccstopgrant or iccautohalt | input current in stop grant or auto halt mode, 66 mhz | | 66 ma | typical 20 ma
iccstopgrant or iccautohalt | input current in stop grant or auto halt mode, 75 mhz | | 75 ma | typical 20 ma
iccstopgrant or iccautohalt | input current in stop grant or auto halt mode, 80 mhz | | 80 ma | typical 30 ma
iccstopgrant or iccautohalt | input current in stop grant or auto halt mode, 100 mhz | | 100 ma | typical 50 ma
iccstopgrant or iccautohalt | input current in stop grant or auto halt mode, 120 mhz | | 120 ma | typical 60 ma
iccstpclk | input current in stop clock mode | | 5 ma | typical 600 µa
ili | input leakage current | | ±15 µa | note 3
iih | input leakage current | | 200 µa | note 4
iil | input leakage current | | -400 µa | note 5
ilo | output leakage current | | ±15 µa |
cin | input capacitance | | 10 pf | fc = 1 mhz (note 6)
co | i/o or output capacitance | | 14 pf | fc = 1 mhz (note 6)
cclk | clk capacitance | | 12 pf | fc = 1 mhz (note 6)

the icc power supply current values apply with inputs at rails and outputs unloaded.

notes:
1. this parameter is measured at: address, data, ben = 4.0 ma; definition, control = 5.0 ma
2. this parameter is measured at: address, data, ben = -1.0 ma; definition, control = -0.9 ma
3. this parameter is for inputs without internal pull-ups or pull-downs and 0 ≤ vin ≤ vcc.
4. this parameter is for inputs with internal pull-downs and vih = 2.4 v.
5. this parameter is for inputs with internal pull-ups and vil = 0.45 v.
6. not 100% tested.
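the supply current figures in the dc characteristics scale linearly with operating frequency, which the short sketch below makes explicit. the current values are copied from the table. the power estimate (maximum current multiplied by the upper vcc limit of 3.6 v) is a simple bound added here for illustration and is not a figure stated in the datasheet.

```python
# maximum and typical icc in ma, keyed by operating frequency in mhz
MAX_ICC_MA = {66: 660, 75: 750, 80: 800, 100: 1000, 120: 1200}
TYP_ICC_MA = {66: 528, 75: 600, 80: 640, 100: 800, 120: 960}
VCC_MAX_V = 3.3 + 0.3  # upper end of the commercial operating range


def ma_per_mhz(table):
    """supply current per mhz of operating frequency for each table entry."""
    return {freq: current / freq for freq, current in table.items()}


def power_bound_w(freq_mhz):
    """rough upper bound on supply power: maximum icc times maximum vcc."""
    return MAX_ICC_MA[freq_mhz] / 1000 * VCC_MAX_V


print(ma_per_mhz(MAX_ICC_MA))   # every entry works out to 10 ma per mhz
print(ma_per_mhz(TYP_ICC_MA))   # every entry works out to 8 ma per mhz
print(round(power_bound_w(120), 2))
```

a bound of this kind is mainly useful for sizing the supply and the thermal solution; actual consumption depends on bus activity and loading.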
the ac specifications, provided in the ac characteristics table, consist of output delays, input setup requirements, and input hold requirements. all ac specifications are relative to the rising edge of the clk signal.

ac specifications measurement is defined by figure 36. all timings are referenced to 1.5 v unless otherwise specified. enhanced am486 microprocessor output delays are specified with minimum and maximum limits, measured as shown. the minimum microprocessor delay times are hold times provided to external circuitry. input setup and hold times are specified as minimums, defining the smallest acceptable sampling window. within the sampling window, a synchronous input signal must be stable for correct microprocessor operation.

switching characteristics over commercial operating ranges

switching characteristics for 33 mhz bus (66 mhz or 100 mhz operating frequency)
vcc = 3.3 v ± 0.3 v; tcase = 0°c to +85°c; cl = 50 pf unless otherwise specified

symbol | parameter | min | max | unit | figure | notes
 | frequency | 8 | 33 | mhz | | note 2
t1 | clk period | 30 | 125 | ns | 39 |
t1a | clk period stability | | 0.1% | Δ | adjacent clocks | notes 3 and 4
t2 | clk high time at 2 v | 11 | | ns | 39 | note 3
t3 | clk low time at 0.8 v | 11 | | ns | 39 | note 3
t4 | clk fall time (2 v–0.8 v) | | 3 | ns | 39 | note 3
t5 | clk rise time (0.8 v–2 v) | | 3 | ns | 39 | note 3
t6 | a31–a2, pwt, pcd, be3–be0, m/io, d/c, cache, w/r, ads, lock, ferr, breq, hlda, smiact, hitm valid delay | 3 | 14 | ns | 40 | note 5
t7 | a31–a2, pwt, pcd, be3–be0, m/io, d/c, cache, w/r, ads, lock float delay | 3 | 20 | ns | 41 | note 3
t8 | pchk valid delay | 3 | 14 | ns | 42 |
t8a | blast, plock valid delay | 3 | 14 | ns | 40 |
t9 | blast, plock float delay | 3 | 20 | ns | 41 | note 3
t10 | d31–d0, dp3–dp0 write data valid delay | 3 | 14 | ns | 40 |
t11 | d31–d0, dp3–dp0 write data float delay | 3 | 20 | ns | 41 | note 3
t12 | eads, inv, wb/wt setup time | 5 | | ns | 43 |
t13 | eads, inv, wb/wt hold time | 3 | | ns | 43 |
t14 | ken, bs16, bs8 setup time | 5 | | ns | 43 |
t15 | ken, bs16, bs8 hold time | 3 | | ns | 43 |
t16 | rdy, brdy setup time | 5 | | ns | 44 |
t17 | rdy, brdy hold time | 3 | | ns | 44 |
t18 | hold, ahold setup time | 6 | | ns | 43 |
t18a | boff setup time | 7 | | ns | 43 |
t19 | hold, ahold, boff hold time | 3 | | ns | 43 |
t20 | reset, flush, a20m, nmi, intr, ignne, stpclk, sreset, smi setup time | 5 | | ns | 43 | note 5
t21 | reset, flush, a20m, nmi, intr, ignne, stpclk, sreset, smi hold time | 3 | | ns | 43 | note 5
t22 | d31–d0, dp3–dp0, a31–a4 read setup time | 5 | | ns | 43, 44 |
t23 | d31–d0, dp3–dp0, a31–a4 read hold time | 3 | | ns | 43, 44 |

notes:
1. specifications assume cl = 50 pf. i/o buffer model must be used to determine delays due to loading (trace and component). first order i/o buffer models for the processor are available.
2. 0 mhz operation guaranteed during stop clock operation.
3. not 100% tested. guaranteed by design characterization.
4. for faster transitions (>0.1% between adjacent clocks), use the stop clock protocol to switch operating frequency.
5. all timings are referenced at 1.5 v (as illustrated in the listed figures) unless otherwise noted.
switching characteristics for 40 mhz bus (80 mhz or 120 mhz operating frequency)
vcc = 3.3 v ± 0.3 v; tcase = 0°c to +85°c; cl = see note 1

symbol | parameter | min | max | unit | figure | notes
 | frequency | 8 | 40 | mhz | | note 2
t1 | clk period | 25 | 125 | ns | 39 |
t1a | clk period stability | | 0.1% | Δ | adjacent clocks | notes 3 and 4
t2 | clk high time at 2 v | 9 | | ns | 39 | note 3
t3 | clk low time at 0.8 v | 9 | | ns | 39 | note 3
t4 | clk fall time (2 v–0.8 v) | | 3 | ns | 39 | note 3
t5 | clk rise time (0.8 v–2 v) | | 3 | ns | 39 | note 3
t6 | a31–a2, pwt, pcd, be3–be0, m/io, d/c, cache, w/r, ads, lock, ferr, breq, hlda, smiact, hitm valid delay | 3 | 14 | ns | 40 | note 5
t7 | a31–a2, pwt, pcd, be3–be0, m/io, d/c, cache, w/r, ads, lock float delay | 3 | 18 | ns | 41 | note 3
t8 | pchk valid delay | 3 | 16 | ns | 42 |
t8a | blast, plock valid delay | 3 | 18 | ns | 40 |
t9 | blast, plock float delay | 3 | 16 | ns | 41 | note 3
t10 | d31–d0, dp3–dp0 write data valid delay | 3 | 16 | ns | 40 |
t11 | d31–d0, dp3–dp0 write data float delay | 3 | 18 | ns | 41 | note 3
t12 | eads, inv, wb/wt setup time | 5 | | ns | 43 |
t13 | eads, inv, wb/wt hold time | 3 | | ns | 43 |
t14 | ken, bs16, bs8 setup time | 5 | | ns | 43 |
t15 | ken, bs16, bs8 hold time | 3 | | ns | 43 |
t16 | rdy, brdy setup time | 5 | | ns | 44 |
t17 | rdy, brdy hold time | 3 | | ns | 44 |
t18 | hold, ahold setup time | 6 | | ns | 43 |
t18a | boff setup time | 8 | | ns | 43 |
t19 | hold, ahold, boff hold time | 3 | | ns | 43 |
t20 | reset, flush, a20m, nmi, intr, ignne, stpclk, sreset, smi setup time | 5 | | ns | 43 | note 5
t21 | reset, flush, a20m, nmi, intr, ignne, stpclk, sreset, smi hold time | 3 | | ns | 43 | note 5
t22 | d31–d0, dp3–dp0, a31–a4 read setup time | 5 | | ns | 43, 44 |
t23 | d31–d0, dp3–dp0, a31–a4 read hold time | 3 | | ns | 43, 44 |

notes:
1. specifications assume cl = 50 pf. i/o buffer model must be used to determine delays due to loading (trace and component). first order i/o buffer models for the processor are available.
2. 0 mhz operation guaranteed during stop clock operation.
3. not 100% tested. guaranteed by design characterization.
4. for faster transitions (>0.1% between adjacent clocks), use the stop clock protocol to switch operating frequency.
5. all timings are referenced at 1.5 v (as illustrated in the listed figures) unless otherwise noted.
Switching Characteristics for 25 MHz Bus (75 MHz Operating Frequency)
VCC = 3.3 V ± 0.3 V; TCASE = 0°C to +85°C; CL = see Note 1

| Symbol | Parameter | Min | Max | Unit | Figure | Notes |
|---|---|---|---|---|---|---|
| | Frequency | 8 | 25 | MHz | | Note 2 |
| t1 | CLK period | 40 | 125 | ns | 39 | |
| t1a | CLK period stability | | 0.1% | Δ | | Adjacent clocks; Notes 3 and 4 |
| t2 | CLK high time at 2 V | 14 | | ns | 39 | Note 3 |
| t3 | CLK low time at 0.8 V | 14 | | ns | 39 | Note 3 |
| t4 | CLK fall time (2 V–0.8 V) | | 4 | ns | 39 | Note 3 |
| t5 | CLK rise time (0.8 V–2 V) | | 4 | ns | 39 | Note 5 |
| t6 | A31–A2, PWT, PCD, BE3–BE0, M/IO, D/C, CACHE, W/R, ADS, LOCK, FERR, BREQ, HLDA, SMIACT, HITM valid delay | 3 | 19 | ns | 40 | Note 4 |
| t7 | A31–A2, PWT, PCD, BE3–BE0, M/IO, D/C, CACHE, W/R, ADS, LOCK float delay | 3 | 28 | ns | 41 | Note 3 |
| t8 | PCHK valid delay | 3 | 24 | ns | 42 | |
| t8a | BLAST, PLOCK valid delay | 3 | 24 | ns | 40 | |
| t9 | BLAST, PLOCK float delay | 3 | 28 | ns | 41 | Note 3 |
| t10 | D31–D0, DP3–DP0 write data valid delay | 3 | 20 | ns | 40 | |
| t11 | D31–D0, DP3–DP0 write data float delay | 3 | 28 | ns | 41 | Note 3 |
| t12 | EADS, INV, WB/WT setup time | 8 | | ns | 43 | |
| t13 | EADS, INV, WB/WT hold time | 3 | | ns | 43 | |
| t14 | KEN, BS16, BS8 setup time | 8 | | ns | 43 | |
| t15 | KEN, BS16, BS8 hold time | 3 | | ns | 43 | |
| t16 | RDY, BRDY setup time | 8 | | ns | 44 | |
| t17 | RDY, BRDY hold time | 3 | | ns | 44 | |
| t18 | HOLD, AHOLD setup time | 8 | | ns | 43 | |
| t18a | BOFF setup time | 8 | | ns | 43 | |
| t19 | HOLD, AHOLD, BOFF hold time | 3 | | ns | 43 | |
| t20 | RESET, FLUSH, A20M, NMI, INTR, IGNNE, STPCLK, SRESET, SMI setup time | 8 | | ns | 43 | Note 5 |
| t21 | RESET, FLUSH, A20M, NMI, INTR, IGNNE, STPCLK, SRESET, SMI hold time | 3 | | ns | 43 | Note 5 |
| t22 | D31–D0, DP3–DP0, A31–A4 read setup time | 5 | | ns | 43, 44 | |
| t23 | D31–D0, DP3–DP0, A31–A4 read hold time | 3 | | ns | 43, 44 | |

Notes:
1. Specifications assume CL = 50 pF. The I/O buffer model must be used to determine delays due to loading (trace and component). First-order I/O buffer models for the processor are available.
2. 0 MHz operation is guaranteed during stop clock operation.
3. Not 100% tested. Guaranteed by design characterization.
4. For faster transitions (>0.1% between adjacent clocks), use the stop clock protocol to switch operating frequency.
5. All timings are referenced at 1.5 V (as illustrated in the listed figures) unless otherwise noted.
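The relationship between the frequency limits and the CLK period limits (t1), and the 0.1% adjacent-clock stability limit (t1a), can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and measuring the change relative to the earlier of the two adjacent periods is an assumption, since the datasheet does not spell out the reference.

```python
def clk_period_ns(freq_mhz):
    """CLK period in ns for a bus frequency in MHz (t1 = 1 / f)."""
    return 1000.0 / freq_mhz


def adjacent_clocks_stable(periods_ns, limit=0.001):
    """Return True if every adjacent pair of CLK periods differs by <= limit.

    limit=0.001 corresponds to the 0.1% figure of t1a. The change is
    measured relative to the earlier period of each pair (an assumption).
    """
    return all(
        abs(later - earlier) / earlier <= limit
        for earlier, later in zip(periods_ns, periods_ns[1:])
    )


# The 25 MHz and 8 MHz frequency limits map to the 40 ns and 125 ns t1 limits.
print(clk_period_ns(25), clk_period_ns(8))

# A slow drift stays within 0.1% per clock; an abrupt 40 ns -> 41 ns step does not.
print(adjacent_clocks_stable([40.0, 40.02, 40.04]))
print(adjacent_clocks_stable([40.0, 41.0]))
```

A change that fails this check is the case for which note 4 calls for the stop clock protocol.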
Enhanced Am486 Microprocessor AC Characteristics for Boundary Scan Test Signals at 25 MHz
VCC = 3.3 V ± 0.3 V; TCASE = 0°C to +85°C; CL = 50 pF unless otherwise specified

| Symbol | Parameter | Min | Max | Unit | Figure | Notes |
|---|---|---|---|---|---|---|
| t24 | TCK frequency | | 25 | MHz | | 1x clock |
| t25 | TCK period | 40 | | ns | 45, 46 | Note 1 |
| t26 | TCK high time at 2 V | 10 | | ns | 45 | |
| t27 | TCK low time at 0.8 V | 10 | | ns | 45 | |
| t28 | TCK rise time (0.8 V–2 V) | | 4 | ns | 45 | Note 2 |
| t29 | TCK fall time (2 V–0.8 V) | | 4 | ns | 45 | Note 2 |
| t30 | TDI, TMS setup time | 8 | | ns | 46 | Note 3 |
| t31 | TDI, TMS hold time | 7 | | ns | 46 | Note 3 |
| t32 | TDO valid delay | 3 | 25 | ns | 46 | Note 3 |
| t33 | TDO float delay | | 36 | ns | 46 | Note 3 |
| t34 | All outputs (non-test) valid delay | 3 | 25 | ns | 46 | Note 3 |
| t35 | All outputs (non-test) float delay | | 30 | ns | 46 | Note 3 |
| t36 | All inputs (non-test) setup delay | 8 | | ns | 46 | Note 3 |
| t37 | All inputs (non-test) hold time | 7 | | ns | 46 | Note 3 |

Notes:
1. TCK period ≥ CLK period.
2. Rise/fall times can be relaxed by 1 ns per 10-ns increase in TCK period.
3. Parameter measured from TCK.
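Notes 1 and 2 of the boundary scan table can be read as two simple rules. The sketch below is one possible reading, not a statement of how AMD applies them: the function names are hypothetical, and counting only whole 10-ns increments above the 40 ns minimum period is a conservative assumption, because the datasheet does not say whether the relaxation is stepped or proportional.

```python
BASE_TCK_PERIOD_NS = 40.0  # t25 minimum TCK period
BASE_EDGE_TIME_NS = 4.0    # t28/t29 maximum rise/fall time at the minimum period


def max_tck_edge_time_ns(tck_period_ns):
    """Maximum TCK rise/fall time for a given TCK period (note 2).

    Assumption: 1 ns of relaxation per whole 10 ns of period above 40 ns.
    """
    if tck_period_ns < BASE_TCK_PERIOD_NS:
        raise ValueError("TCK period is below the 40 ns minimum (t25)")
    whole_steps = int((tck_period_ns - BASE_TCK_PERIOD_NS) // 10)
    return BASE_EDGE_TIME_NS + 1.0 * whole_steps


def tck_period_ok(tck_period_ns, clk_period_ns):
    """Note 1: TCK period must be at least the CLK period (and at least t25 min)."""
    return tck_period_ns >= max(clk_period_ns, BASE_TCK_PERIOD_NS)


print(max_tck_edge_time_ns(40.0))   # no relaxation at the minimum period
print(max_tck_edge_time_ns(60.0))   # two whole 10-ns steps of relaxation
print(tck_period_ok(50.0, 125.0))   # TCK faster than an 8 MHz CLK violates note 1
```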
Key to Switching Waveforms

[Waveform symbols are graphics in the original; the meaning of each symbol is listed in order.]

| Inputs | Outputs |
|---|---|
| Must be steady | Will be steady |
| May change from H to L | Will change from H to L |
| May change from L to H | Will change from L to H |
| Don't care; any change permitted | Changing; state unknown |
| Does not apply | Center line is high-impedance "off" state |

Figure 39. CLK waveforms
Figure 40. Output valid delay timing
Figure 41. Maximum float delay timing
Figure 42. PCHK valid delay timing
Figure 43. Input setup and hold timing
Figure 44. RDY and BRDY input setup and hold timing
Figure 45. TCK waveforms
Figure 46. Test signal timing diagram
12 Package Thermal Specifications

The Am486 microprocessor is specified for operation when TCASE (the case temperature) is within the range of 0°C to +85°C. TCASE can be measured in any environment to determine whether the Am486 microprocessor is within the specified operating range. The case temperature should be measured at the center of the top surface opposite the pins.

The ambient temperature (TA) is guaranteed as long as TCASE is not violated. The ambient temperature can be calculated from θJC and θJA and from the following equations:

TJ = TCASE + P · θJC
TA = TJ - P · θJA
TCASE = TA + P · [θJA - θJC]

where:
TJ, TA, TCASE = junction, ambient, and case temperature.
θJC, θJA = junction-to-case and junction-to-ambient thermal resistance, respectively.
P = maximum power consumption.

The values for θJA and θJC are given in Table 21 for the 1.75 sq. in., 168-pin, ceramic PGA. For the 208-pin SQFP plastic package, θJA = 14.0 and θJC = 1.5.

Table 22 shows the TA allowable (without exceeding TCASE) at various airflows and operating frequencies (clock). Note that TA is greatly improved by attaching a heat sink to the package. P (the maximum power consumption) is calculated by using the maximum ICC at 3.3 V as tabulated in the DC Characteristics.

Table 21. Thermal Resistance (°C/W) θJC and θJA for the Am486 CPU in 168-Pin PGA Package

θJA is listed against airflow in ft/min (m/sec).

| Cooling mechanism | θJC | θJA at 0 (0) | 200 (1.01) | 400 (2.03) | 600 (3.04) | 800 (4.06) | 1000 (5.07) |
|---|---|---|---|---|---|---|---|
| No heat sink | 1.5 | 16.5 | 14.0 | 12.0 | 10.5 | 9.5 | 9.0 |
| Heat sink* | 2.0 | 12.0 | 7.0 | 5.0 | 4.0 | 3.5 | 3.25 |
| Heat sink* and fan | 2.0 | 5.0 | 4.6 | 4.2 | 3.8 | 3.5 | 3.25 |

*0.350" high unidirectional heat sink (Al alloy 6063-T5, 40 mil fin width, 155 mil center-to-center fin spacing).

Table 22. Maximum TA at Various Airflows in °C

Airflow is in ft/min (m/sec).

| TA by cooling type | Clock | 0 (0) | 200 (1.01) | 400 (2.03) | 600 (3.04) | 800 (4.06) | 1000 (5.07) |
|---|---|---|---|---|---|---|---|
| TA without heat sink | 66 MHz | 52.3 | 57.8 | 62.1 | 65.4 | 67.6 | 68.7 |
| TA with heat sink | 66 MHz | 63.2 | 74.1 | 78.5 | 80.6 | 81.7 | 82.3 |
| | 75 MHz | 60.3 | 72.6 | 77.6 | 80.1 | 81.3 | 81.9 |
| | 80 MHz | 58.6 | 71.8 | 77.1 | 79.7 | 81.0 | 81.7 |
| | 100 MHz | 52.0 | 68.5 | 75.1 | 78.4 | 80.1 | 80.9 |
| | 120 MHz | 45.4 | 65.2 | 73.1 | 77.1 | 79.1 | 80.1 |
| TA with heat sink and fan | 66 MHz | 78.5 | 79.3 | 80.2 | 81.1 | 81.7 | 82.3 |
| | 75 MHz | 77.6 | 78.6 | 79.6 | 80.5 | 81.3 | 81.9 |
| | 80 MHz | 77.1 | 78.1 | 79.2 | 80.2 | 81.0 | 81.7 |
| | 100 MHz | 75.1 | 76.4 | 77.7 | 79.1 | 80.1 | 80.9 |
| | 120 MHz | 73.1 | 74.7 | 76.3 | 77.9 | 79.1 | 80.1 |
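As a worked check of the thermal equations, the third equation can be rearranged to TA = TCASE - P · (θJA - θJC) and applied to the tabulated values. The sketch below is illustrative only: the function names are hypothetical, and the power figure is not taken from the DC characteristics but backed out from one Table 22 entry, which is an assumption made so that the example stays within this section.

```python
T_CASE_MAX = 85.0  # deg C, upper end of the specified case temperature range


def max_ambient(power_w, theta_ja, theta_jc, t_case_max=T_CASE_MAX):
    """Maximum ambient temperature: TA = TCASE - P * (thetaJA - thetaJC)."""
    return t_case_max - power_w * (theta_ja - theta_jc)


def implied_power(t_ambient, theta_ja, theta_jc, t_case_max=T_CASE_MAX):
    """Power that would produce a given TA limit (same equation solved for P)."""
    return (t_case_max - t_ambient) / (theta_ja - theta_jc)


# Back out P from Table 22: 66 MHz, no heat sink, 0 ft/min gives TA = 52.3,
# with thetaJA = 16.5 and thetaJC = 1.5 from Table 21.
p_66mhz = implied_power(52.3, theta_ja=16.5, theta_jc=1.5)
print(f"implied P at 66 MHz: {p_66mhz:.2f} W")

# Reuse that P for other Table 21 entries and compare with Table 22.
no_sink_200 = max_ambient(p_66mhz, theta_ja=14.0, theta_jc=1.5)  # table: 57.8
sink_200 = max_ambient(p_66mhz, theta_ja=7.0, theta_jc=2.0)      # table: 74.1
print(f"no heat sink, 200 ft/min: {no_sink_200:.2f} C")
print(f"heat sink, 200 ft/min:    {sink_200:.2f} C")
```

Both computed values land within rounding of the corresponding Table 22 entries, which suggests the table was generated from the same equation with a single power figure per clock frequency.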
13 Physical Dimensions

168-Pin PGA

[Figure: 168-pin PGA package outline. Bottom view (pins facing up) with index corner; side view showing base plane and seating plane. Dimension callouts in the drawing: 1.735/1.765 (two sides), 1.595/1.605 (two sides), 0.140/0.180, 0.110/0.140, 0.105/0.125, 0.090/0.110, 0.025/0.045, 0.017/0.020.]

Notes:
1. All measurements are in inches.
2. Not to scale. For reference only.
3. BSC is an ANSI standard for basic space centering.
208-Pin SQFP

[Figure: 208-pin SQFP package outline. Top view with pin 1 I.D. and pins 52, 104, 156, and 208 marked; side view showing seating plane; datums -A-, -B-, -D-. Dimension callouts in the drawing: 30.35/30.85 (two sides), 27.90/28.10 (two sides), 25.50 ref (two sides), 0.50 basic, 0.25 min, 3.17/3.67, 3.80 max, 0.46/0.66.]

Notes:
1. All measurements are in millimeters unless otherwise noted.
2. Not to scale. For reference only.

AMD, Am386, and Am486 are registered trademarks of Advanced Micro Devices, Inc. FusionPC is a service mark of Advanced Micro Devices, Inc. Microsoft is a registered trademark and Windows is a trademark of Microsoft Corp. Product names used in this publication are for identification purposes only and may be trademarks of their respective companies.

